(57) Abstract: In Caenorhabditis elegans, lin-4 and let-7 encode 22- and 21-nucleotide RNAs, respectively, that function as key
regulators of developmental timing. Because the appearance of these short RNAs is regulated during development, they are also
referred to as "small temporal RNAs" (stRNAs). We show that many more 21- and 22-nt expressed RNAs, termed microRNAs
(miRNAs), exist in invertebrates and vertebrates, and that some of these novel RNAs, similar to let-7 stRNA, are also highly
conserved. This suggests that sequence-specific post-transcriptional regulatory mechanisms mediated by small RNAs are more general
than previously appreciated.

MicroRNA molecules

Description 

The present invention relates to novel small expressed (micro)RNA
molecules associated with physiological regulatory mechanisms, 
particularly in developmental control. 

In Caenorhabditis elegans, lin-4 and let-7 encode 22- and 21-nucleotide
RNAs, respectively (1, 2), that function as key regulators of developmental
timing (3-5). Because the appearance of these short RNAs is regulated
during development, they are also referred to as "microRNAs" (miRNAs) or
small temporal RNAs (stRNAs) (6). lin-4 and let-7 are the only known
miRNAs to date.

Two distinct pathways exist in animals and plants in which 21- to 23- 
nucleotide RNAs function as post-transcriptional regulators of gene 
expression. Small interfering RNAs (siRNAs) act as mediators of sequence- 
specific mRNA degradation in RNA interference (RNAi) (7-11) whereas 
miRNAs regulate developmental timing by mediating sequence-specific 
repression of mRNA translation (3-5). siRNAs and miRNAs are excised from 
double-stranded RNA (dsRNA) precursors by Dicer (12, 13, 29), a 
multidomain RNase III protein, thus producing RNA species of similar size.
However, siRNAs are believed to be double-stranded (8, 11, 12), while 
miRNAs are single-stranded (6). 

We show that many more short, particularly 21- and 22-nt expressed 
RNAs, termed microRNAs (miRNAs), exist in invertebrates and vertebrates, 
and that some of these novel RNAs, similar to let-7 RNA (6), are also 
highly conserved. This suggests that sequence-specific post-transcriptional
regulatory mechanisms mediated by small RNAs are more general than
previously appreciated.

The present invention relates to an isolated nucleic acid molecule 
comprising: 

(a) a nucleotide sequence as shown in Table 1, Table 2, Table 3
or Table 4,

(b) a nucleotide sequence which is the complement of (a),

(c) a nucleotide sequence which has an identity of at least 80%,
preferably of at least 90% and more preferably of at least
99%, to a sequence of (a) or (b), and/or

(d) a nucleotide sequence which hybridizes under stringent
conditions to a sequence of (a), (b) and/or (c).



In a preferred embodiment the invention relates to miRNA molecules and 
analogs thereof, to miRNA precursor molecules and to DNA molecules 
encoding miRNA or miRNA precursor molecules. 



Preferably the identity of sequence (c) to a sequence of (a) or (b) is at least 
90%, more preferably at least 95%. The determination of identity (percent) 
may be carried out as follows: 



I = n : L 



wherein I is the identity in percent, n is the number of identical nucleotides 
between a given sequence and a comparative sequence as shown in Table 
1, Table 2, Table 3 or Table 4 and L is the length of the comparative 
sequence. It should be noted that the nucleotides A, C, G and U as
depicted in Tables 1, 2, 3 and 4 may denote ribonucleotides,
deoxyribonucleotides and/or other nucleotide analogs, e.g. synthetic
non-naturally occurring nucleotide analogs. Further, nucleobases may be
substituted by corresponding nucleobases capable of forming analogous
H-bonds to a complementary nucleic acid sequence, e.g. U may be
substituted by T.
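
The identity calculation defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two sequences are compared position by position without gaps (the text does not specify an alignment procedure), the ratio n : L is multiplied by 100 to express it in percent, and the function name `percent_identity` is hypothetical. The two example sequences are the let-7d and let-7e oligonucleotide sequences listed in the legend of Fig. 1B.

```python
def percent_identity(given: str, comparative: str) -> float:
    """Return I = n / L * 100 for an ungapped, position-wise comparison."""
    # Treat T as equivalent to U, since the text allows U to be substituted by T.
    a = given.upper().replace("T", "U")
    b = comparative.upper().replace("T", "U")
    if not b:
        raise ValueError("comparative sequence must not be empty")
    # n: nucleotides identical at the same position (no gaps; an assumption).
    n = sum(1 for x, y in zip(a, b) if x == y)
    # L: length of the comparative sequence, as defined in the text.
    return 100.0 * n / len(b)


let7d = "ACTATGCAACCTACTACCTCT"  # listed for let-7d in the Fig. 1B legend
let7e = "ACTATACAACCTCCTACCTCA"  # listed for let-7e in the Fig. 1B legend
identity = percent_identity(let7d, let7e)
print(f"{identity:.1f}%")  # 18 of 21 positions match: 85.7%
print("meets the 80% threshold:", identity >= 80.0)
```

Under these assumptions the two 21-nt sequences differ at three positions, so they would fall above the 80% threshold but below the 90% one.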

Further, the invention encompasses nucleotide sequences which hybridize 
under stringent conditions with the nucleotide sequence as shown in Table 
1, Table 2, Table 3 or Table 4, a complementary sequence thereof or a 
highly identical sequence. Stringent hybridization conditions comprise 
washing for 1 h in 1 x SSC and 0.1% SDS at 45°C, preferably at 48°C and
more preferably at 50°C, particularly for 1 h in 0.2 x SSC and 0.1% SDS.

The isolated nucleic acid molecules of the invention preferably have a
length of from 18 to 100 nucleotides, and more preferably from 18 to 80 
nucleotides. It should be noted that mature miRNAs usually have a length 
of 19-24 nucleotides, particularly 21, 22 or 23 nucleotides. The miRNAs, 
however, may be also provided as a precursor which usually has a length 
of 50-90 nucleotides, particularly 60-80 nucleotides. It should be noted 
that the precursor may be produced by processing of a primary transcript 
which may have a length of > 100 nucleotides. 
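
As a rough illustration, the typical length ranges given above can be turned into a simple lookup. This is a heuristic sketch only: the text says these lengths are "usual", not definitive, and the function name `classify_by_length` and the label "unclassified" are hypothetical.

```python
def classify_by_length(length_nt: int) -> str:
    """Map a length in nucleotides to the molecule class it usually indicates."""
    if length_nt <= 0:
        raise ValueError("length must be a positive number of nucleotides")
    if 19 <= length_nt <= 24:
        return "mature miRNA"        # usually 19-24 nt, particularly 21-23 nt
    if 50 <= length_nt <= 90:
        return "miRNA precursor"     # usually 50-90 nt, particularly 60-80 nt
    if length_nt > 100:
        return "primary transcript"  # may be longer than 100 nt
    return "unclassified"            # falls outside the ranges named in the text


for n in (22, 70, 250, 40):
    print(n, "nt ->", classify_by_length(n))
```

Length alone cannot establish what a molecule is; the ranges only indicate which class a cloned RNA of a given size is most likely to belong to.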

The nucleic acid molecules may be present in single-stranded or double-
stranded form. The miRNA as such is usually a single-stranded molecule,
while the miRNA precursor is usually an at least partially self-complementary
molecule capable of forming double-stranded portions, e.g. stem- and loop-
structures. The invention also encompasses DNA molecules encoding the miRNA
and miRNA precursor molecules. The nucleic acids may be selected from RNA,
DNA or nucleic acid analog molecules, such as sugar- or backbone-modified
ribonucleotides or deoxyribonucleotides. It should be noted, however, that
other nucleic acid analogs, such as peptide nucleic acids (PNA) or locked
nucleic acids (LNA), are also suitable.

In an embodiment of the invention the nucleic acid molecule is an RNA or
DNA molecule which contains at least one modified nucleotide analog, i.e.
a naturally occurring ribonucleotide or deoxyribonucleotide is substituted
by a non-naturally occurring nucleotide. The modified nucleotide analog
may be located, for example, at the 5'-end and/or the 3'-end of the nucleic
acid molecule.

Preferred nucleotide analogs are selected from sugar- or backbone-modified
ribonucleotides. It should be noted, however, that nucleobase-modified
ribonucleotides, i.e. ribonucleotides containing a non-naturally
occurring nucleobase instead of a naturally occurring nucleobase, such as
uridines or cytidines modified at the 5-position, e.g. 5-(2-amino)propyl
uridine, 5-bromo uridine; adenosines and guanosines modified at the 8-
position, e.g. 8-bromo guanosine; deaza nucleotides, e.g. 7-deaza-
adenosine; O- and N-alkylated nucleotides, e.g. N6-methyl adenosine, are
also suitable. In preferred sugar-modified ribonucleotides the 2'-OH group is
replaced by a group selected from H, OR, R, halo, SH, SR, NH2, NHR, NR2
or CN, wherein R is C1-C6 alkyl, alkenyl or alkynyl and halo is F, Cl, Br or I.
In preferred backbone-modified ribonucleotides the phosphoester group
connecting to adjacent ribonucleotides is replaced by a modified group,
e.g. a phosphothioate group. It should be noted that the above
modifications may be combined.

The nucleic acid molecules of the invention may be obtained by chemical 
synthesis methods or by recombinant methods, e.g. by enzymatic 
transcription from synthetic DNA-templates or from DNA-plasmids isolated
from recombinant organisms. Typically phage RNA-polymerases are used 
for transcription, such as T7, T3 or SP6 RNA-polymerases. 

The invention also relates to a recombinant expression vector comprising 
a recombinant nucleic acid operatively linked to an expression control 
sequence, wherein expression, i.e. transcription and optionally further
processing, results in a miRNA-molecule or miRNA precursor molecule as
described above. The vector is preferably a DNA-vector, e.g. a viral vector 
or a plasmid, particularly an expression vector suitable for nucleic acid 
expression in eukaryotic, more particularly mammalian cells. The 
recombinant nucleic acid contained in said vector may be a sequence 
which results in the transcription of the miRNA-molecule as such, a 
precursor or a primary transcript thereof, which may be further processed 
to give the miRNA-molecule. 

Further, the invention relates to diagnostic or therapeutic applications of
the claimed nucleic acid molecules. For example, miRNAs may be detected
in biological samples, e.g. in tissue sections, in order to determine and
classify certain cell types or tissue types or miRNA-associated pathogenic
disorders which are characterized by differential expression of miRNA-
molecules or miRNA-molecule patterns. Further, the developmental stage
of cells may be classified by determining temporally expressed miRNA-
molecules.

Further, the claimed nucleic acid molecules are suitable for therapeutic 
applications. For example, the nucleic acid molecules may be used as 
modulators or targets of developmental processes or disorders associated 
with developmental dysfunctions, such as cancer. For example, miR-15 
and miR-16 probably function as tumor-suppressors and thus expression or
delivery of these RNAs or analogs or precursors thereof to tumor cells may
provide therapeutic efficacy, particularly against leukemias, such as B-cell
chronic lymphocytic leukemia (B-CLL). Further, miR-10 is a possible
regulator of the translation of Hox genes, particularly Hox 3 and Hox 4 (or
Scr and Dfd in Drosophila). 

In general, the claimed nucleic acid molecules may be used as a modulator 
of the expression of genes which are at least partially complementary to 
said nucleic acid. Further, miRNA molecules may act as targets for
therapeutic screening procedures, e.g. inhibition or activation of miRNA
molecules might modulate a cellular differentiation process, e.g. apoptosis.

Furthermore, existing miRNA molecules may be used as starting materials
for the manufacture of sequence-modified miRNA molecules, in order to
modify the target specificity thereof, e.g. to direct them against an
oncogene, a multidrug-resistance gene or another therapeutic target gene.
The novel engineered miRNA molecules preferably have an identity of at
least 80% to the starting miRNA, e.g. as depicted in Tables 1, 2, 3 and 4.
Further, miRNA molecules can be modified so that they are symmetrically
processed and then generated as double-stranded siRNAs which are again
directed against therapeutically relevant targets.

Furthermore, miRNA molecules may be used for tissue reprogramming 
procedures, e.g. a differentiated cell line might be transformed by 
expression of miRNA molecules into a different cell type or a stem cell. 

For diagnostic or therapeutic applications, the claimed RNA molecules are 
preferably provided as a pharmaceutical composition. This pharmaceutical 
composition comprises as an active agent at least one nucleic acid 
molecule as described above and optionally a pharmaceutically acceptable 
carrier. 

The administration of the pharmaceutical composition may be carried out 
by known methods, wherein a nucleic acid is introduced into a desired 
target cell in vitro or in vivo. 

Commonly used gene transfer techniques include calcium phosphate, 
DEAE-dextran, electroporation and microinjection and viral methods [30, 
31, 32, 33, 34]. A recent addition to this arsenal of techniques for the 
introduction of DNA into cells is the use of cationic liposomes [35].
Commercially available cationic lipid formulations are e.g. Tfx 50 (Promega)
or Lipofectamine 2000 (Life Technologies).

The composition may be in form of a solution, e.g. an injectable solution, 
a cream, ointment, tablet, suspension or the like. The composition may be 
administered in any suitable way, e.g. by injection, by oral, topical, nasal, 
rectal application etc. The carrier may be any suitable pharmaceutical 
carrier. Preferably, a carrier is used, which is capable of increasing the 
efficacy of the RNA molecules to enter the target-cells. Suitable examples 
of such carriers are liposomes, particularly cationic liposomes. 

Further, the invention relates to a method of identifying novel microRNA- 
molecules and precursors thereof, in eukaryotes, particularly in vertebrates 
and more particularly in mammals, such as humans or mice. This method 
comprises ligating 5'- and 3'-adapter-molecules to the ends of a
size-fractionated RNA-population, reverse transcribing said adapter-ligated
RNA-population, and characterizing said reverse transcribed RNA-molecules,
e.g. by amplification, concatamerization, cloning and sequencing.

A method as described above already has been described in (8), however, 
for the identification of siRNA molecules. Surprisingly, it was found now 
that the method is also suitable for identifying the miRNA molecules or 
precursors thereof as claimed in the present application. 

Further, it should be noted that as 3'-adaptor for derivatization of the
3'-OH group not only 4-hydroxymethylbenzyl but other types of derivatization
groups, such as alkyl, alkyl amino, ethylene glycol or 3'-deoxy groups are
suitable.

Further, the invention shall be explained in more detail by the following 
Figures and Examples:

Figure Legends

Fig. 1A. Expression of D. melanogaster miRNAs. Northern blots of total
RNA isolated from staged populations of D. melanogaster were probed for
the indicated miRNAs. The position of 76-nt val-tRNA is also indicated on
the blots. 5S rRNA serves as loading control. E, embryo; L, larval stage; P,
pupae; A, adult; S2, Schneider-2 cells. It should be pointed out that S2
cells are polyclonal, derived from an unknown subset of embryonic tissues, 
and may have also lost some features of their tissue of origin while 
maintained in culture. miR-3 to miR-6 RNAs were not detectable in S2 cells 
(data not shown). miR-14 was not detected by Northern blotting and may 
be very weakly expressed, which is consistent with its cloning frequency. 
Similar miRNA sequences are difficult to distinguish by Northern blotting 
because of potential cross-hybridization of probes. 

Fig. 1B. Expression of vertebrate miRNAs. Northern blots of total RNA
isolated from HeLa cells, mouse kidneys, adult zebrafish, frog ovaries, and 
S2 cells were probed for the indicated miRNAs. The position of 76-nt 
val-tRNA is also indicated on the blots. 5S rRNA from the preparations of 
total RNA from the indicated species is also shown. The gels used for 
probing of miR-18, miR-19a, miR-30, and miR-31 were not run as far as 
the other gels (see tRNA marker position). miR-32 and miR-33 were not 
detected by Northern blotting, which is consistent with their low cloning 
frequency. Oligodeoxynucleotides used as Northern probes were: 
let-7a, 5' TACTATACAACCTACTACCTCAATTTGCC (SEQ ID NO:1);
let-7d, 5' ACTATGCAACCTACTACCTCT (SEQ ID NO:2);
let-7e, 5' ACTATACAACCTCCTACCTCA (SEQ ID NO:3);
D. melanogaster val-tRNA, 5' TGGTGTTTCCGCCCGGGAA (SEQ ID NO:4);
miR-1, 5' TGGAATGTAAAGAAGTATGGAG (SEQ ID NO:5);
miR-2b, 5' GCTCCTCAAAGCTGGCTGTGATA (SEQ ID NO:6);
miR-3, 5' TGAGACACACTTTGCCCAGTGA (SEQ ID NO:7);
miR-4, 5' TCAATGGTTGTCTAGCTTTAT (SEQ ID NO:8);
miR-5, 5' CATATCACAACGATCGTTCCTTT (SEQ ID NO:9);
miR-6, 5' AAAAAGAACAGCCACTGTGATA (SEQ ID NO:10);
miR-7, 5' TGGAAGACTAGTGATTTTGTTGT (SEQ ID NO:11);
miR-8, 5' GACATCTTTACCTGACAGTATTA (SEQ ID NO:12);
miR-9, 5' TCATACAGCTAGATAACCAAAGA (SEQ ID NO:13);
miR-10, 5' ACAAATTCGGATCTACAGGGT (SEQ ID NO:14);
miR-11, 5' GCAAGAACTCAGACTGTGATG (SEQ ID NO:15);
miR-12, 5' ACCAGTACCTGATGTAATACTCA (SEQ ID NO:16);
miR-13a, 5' ACTCGTCAAAATGGCTGTGATA (SEQ ID NO:17);
miR-14, 5' TAGGAGAGAGAAAAAGACTGA (SEQ ID NO:18);
miR-15, 5' TAGCAGCACATAATGGTTTGT (SEQ ID NO:19);
miR-16, 5' GCCAATATTTACGTGCTGCTA (SEQ ID NO:20);
miR-17, 5' TACAAGTGCCTTCACTGCAGTA (SEQ ID NO:21);
miR-18, 5' TATCTGCACTAGATGCACCTTA (SEQ ID NO:22);
miR-19a, 5' TCAGTTTTGCATAGATTTGCACA (SEQ ID NO:23);
miR-20, 5' TACCTGCACTATAAGCACTTTA (SEQ ID NO:24);
miR-21, 5' TCAACATCAGTCTGATAAGCTA (SEQ ID NO:25);
miR-22, 5' ACAGTTCTTCAACTGGCAGCTT (SEQ ID NO:26);
miR-23, 5' GGAAATCCCTGGCAATGTGAT (SEQ ID NO:27);
miR-24, 5' CTGTTCCTGCTGAACTGAGCCA (SEQ ID NO:28);
miR-25, 5' TCAGACCGAGACAAGTGCAATG (SEQ ID NO:29);
miR-26a, 5' AGCCTATCCTGGATTACTTGAA (SEQ ID NO:30);
miR-27, 5' AGCGGAACTTAGCCACTGTGAA (SEQ ID NO:31);
miR-28, 5' CTCAATAGACTGTGAGCTCCTT (SEQ ID NO:32);
miR-29, 5' AACCGATTTCAGATGGTGCTAG (SEQ ID NO:33);
miR-30, 5' GCTGCAAACATCCGACTGAAAG (SEQ ID NO:34);
miR-31, 5' CAGCTATGCCAGCATCTTGCCT (SEQ ID NO:35);
miR-32, 5' GCAACTTAGTAATGTGCAATA (SEQ ID NO:36);
miR-33, 5' TGCAATGCAACTACAATGCACC (SEQ ID NO:37).



Fig. 2. Genomic organization of miRNA gene clusters. The precursor
structure is indicated as a box and the location of the miRNA within the
precursor is shown in gray; the chromosomal location is also indicated to
the right. (A) D. melanogaster miRNA gene clusters. (B) Human miRNA
gene clusters. The cluster of let-7a-1 and let-7f-1 is separated by 26500 nt
from a copy of let-7d on chromosome 9 and 17. A cluster of let-7a-3 and
let-7b, separated by 938 nt on chromosome 22, is not illustrated.

Fig. 3. Predicted precursor structures of D. melanogaster miRNAs. RNA 
secondary structure prediction was performed using mfold version 3.1 [28] 
and manually refined to accommodate G/U wobble base pairs in the helical 
segments. The miRNA sequence is underlined. The actual size of the stem- 
loop structure is not known experimentally and may be slightly shorter or 
longer than represented. Multicopy miRNAs and their corresponding 
precursor structures are also shown. 

Fig. 4. Predicted precursor structures of human miRNAs. For legend, see
Fig. 3. 

Fig. 5. Expression of novel mouse miRNAs. Northern blot analysis of novel 
mouse miRNAs. Total RNA from different mouse tissues was blotted and 
probed with a 5'-radiolabeled oligodeoxynucleotide complementary to the
indicated miRNA. Equal loading of total RNA on the gel was verified by
ethidium bromide staining prior to transfer; the band representing tRNAs is
shown. The fold-back precursors are indicated with capital L. Mouse brains
were dissected into midbrain, mb, cortex, cx, cerebellum, cb. The rest of
the brain, rb, was also used. Other tissues were heart, ht, lung, lg, liver, lv,
colon, co, small intestine, si, pancreas, pc, spleen, sp, kidney, kd, skeletal
muscle, sm, stomach, st, H, human HeLa SS3 cells. Oligodeoxynucleotides
used as Northern probes were: 

miR-1a, CTCCATACTTCTTTACATTCCA (SEQ ID NO:38);
miR-30b, GCTGAGTGTAGGATGTTTACA (SEQ ID NO:39);
miR-30a-s, GCTTCCAGTCGAGGATGTTTACA (SEQ ID NO:40);
miR-99b, CGCAAGGTCGGTTCTACGGGTG (SEQ ID NO:41);
miR-101, TCAGTTATCACAGTACTGTA (SEQ ID NO:42);
miR-122a, ACAAACACCATTGTCACACTCCA (SEQ ID NO:43);
miR-124a, TGGCATTCACCGCGTGCCTTA (SEQ ID NO:44);
miR-125a, CACAGGTTAAAGGGTCTCAGGGA (SEQ ID NO:45);
miR-125b, TCACAAGTTAGGGTCTCAGGGA (SEQ ID NO:46);
miR-127, AGCCAAGCTCAGACGGATCCGA (SEQ ID NO:47);
miR-128, AAAAGAGACCGGTTCACTCTGA (SEQ ID NO:48);
miR-129, GCAAGCCCAGACCGAAAAAAG (SEQ ID NO:49);
miR-130, GCCCTTTTAACATTGCACTC (SEQ ID NO:50);
miR-131, ACTTTCGGTTATCTAGCTTTA (SEQ ID NO:51);
miR-132, ACGACCATGGCTGTAGACTGTTA (SEQ ID NO:52);
miR-143, TGAGCTACAGTGCTTCATCTCA (SEQ ID NO:53).

Fig. 6. Potential orthologs of lin-4 stRNA. (A) Sequence alignment of C.
elegans lin-4 stRNA with mouse miR-125a and miR-125b and the D.
melanogaster miR-125. Differences are highlighted by gray boxes. (B)
Northern blot of total RNA isolated from staged populations of D.
melanogaster, probed for miR-125. E, embryo; L, larval stage; P, pupae; A,
adult; S2, Schneider-2 cells.

Fig. 7. Predicted precursor structures of miRNAs, sequence accession 
numbers and homology information. RNA secondary structure prediction 
was performed using mfold version 3.1 and manually refined to 
accommodate G/U wobble base pairs in the helical segments. Dashes were 
inserted into the secondary structure presentation when asymmetrically 
bulged nucleotides had to be accommodated. The excised miRNA 
sequence is underlined. The actual size of the stem-loop structure is not 
known experimentally and may be slightly shorter or longer than 
represented. Multicopy miRNAs and their corresponding precursor 
structures are also shown. In cases where no mouse precursors were yet 
deposited in the database, the human orthologs are indicated. miRNAs
which correspond to D. melanogaster or human sequences are included.
Published C. elegans miRNAs [36, 37] are also included in the table. A
recent set of new HeLa cell miRNAs is also indicated [46]. If several ESTs
were retrieved for one organism in the database, only those with different
precursor sequences are listed. miRNA homologs found in other species are
indicated. Chromosomal location and sequence accession numbers, and
clusters of miRNA genes are indicated. Sequences from cloned miRNAs
were searched against mouse and human in GenBank (including trace
data), and against Fugu rubripes and Danio rerio at www.jgi.doe.gov and
www.sanger.ac.uk, respectively.

EXAMPLE 1: MicroRNAs from D. melanogaster and human.

We previously developed a directional cloning procedure to isolate siRNAs 
after processing of long dsRNAs in Drosophila melanogaster embryo lysate 
(8). Briefly, 5' and 3' adapter molecules were ligated to the ends of a
size-fractionated RNA population, followed by reverse transcription, PCR
amplification, concatamerization, cloning and sequencing. This method,
originally intended to isolate siRNAs, led to the simultaneous identification
of 14 novel 20- to 23-nt short RNAs which are encoded in the D.
melanogaster genome and which are expressed in 0 to 2 h embryos (Table
1). The method was adapted to clone RNAs in a similar size range from
HeLa cell total RNA (14), which led to the identification of 19 novel human
stRNAs (Table 2), thus providing further evidence for the existence of a
large class of small RNAs with potential regulatory roles. According to their 
small size, we refer to these novel RNAs as microRNAs or miRNAs. The 
miRNAs are abbreviated as miR-1 to miR-33, and the genes encoding 
miRNAs are named mir-1 to mir-33. Highly homologous miRNAs are 
classified by adding a lowercase letter, followed by a dash and a number 
for designating multiple genomic copies of a mir gene. 
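
The naming convention just described can be made concrete with a small parser. This is a sketch under assumptions: the function name `parse_mir_name`, the `MirName` record and its field names are hypothetical, and accepting the `let-` and `lin-` prefixes alongside `miR-`/`mir-` is an extension made here so that names such as let-7a-1 (used in the legend of Fig. 2) can be handled.

```python
import re
from typing import NamedTuple, Optional


class MirName(NamedTuple):
    prefix: str                # "miR" for the RNA, "mir" for the gene, or let/lin
    number: int                # running number, e.g. 19 in miR-19a
    variant: Optional[str]     # lowercase letter for highly homologous miRNAs
    gene_copy: Optional[int]   # number after the dash for multiple genomic copies


_NAME = re.compile(r"^(miR|mir|let|lin)-(\d+)([a-z]?)(?:-(\d+))?$")


def parse_mir_name(name: str) -> MirName:
    """Split a name such as 'miR-19a' or 'let-7a-1' into its components."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"not a recognised miRNA name: {name!r}")
    prefix, number, variant, gene_copy = match.groups()
    return MirName(prefix, int(number), variant or None,
                   int(gene_copy) if gene_copy else None)


for example in ("miR-19a", "let-7a-1", "mir-6"):
    print(example, "->", parse_mir_name(example))
```

Names that carry other suffixes are rejected with a `ValueError` rather than guessed at.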

The expression and size of the cloned, endogenous short RNAs were also
examined by Northern blotting (Fig. 1, Table 1 and 2). Total RNA isolation
was performed by acid guanidinium thiocyanate-phenol-chloroform
extraction [45]. Northern analysis was performed as described [1], except
that the total RNA was resolved on a 15% denaturing polyacrylamide gel,
transferred onto Hybond-N+ membrane (Amersham Pharmacia Biotech),
and the hybridization and wash steps were performed at 50°C.
Oligodeoxynucleotides used as Northern probes were 5'-32P-
phosphorylated, complementary to the miRNA sequence and 20 to 25 nt in
length.
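
A probe complementary to a miRNA is the reverse complement of its sequence, written as DNA. The sketch below shows that operation; the function name `reverse_complement` is hypothetical. As a check against the figure legends, the 22-nt sequence listed for miR-1 under Fig. 1B and the sequence listed for miR-1a under Fig. 5 turn out to be reverse complements of each other.

```python
# Translation table for DNA bases; U is converted to T before it is applied.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of an RNA or DNA sequence, as DNA."""
    dna = sequence.upper().replace("U", "T")
    unknown = set(dna) - set("ACGT")
    if unknown:
        raise ValueError(f"unexpected characters: {sorted(unknown)}")
    return dna.translate(_COMPLEMENT)[::-1]


listed_for_mir1 = "TGGAATGTAAAGAAGTATGGAG"   # Fig. 1B legend, miR-1
listed_for_mir1a = "CTCCATACTTCTTTACATTCCA"  # Fig. 5 legend, miR-1a

result = reverse_complement(listed_for_mir1)
print(result)                                # CTCCATACTTCTTTACATTCCA
print(result == listed_for_mir1a)            # True
print(20 <= len(result) <= 25)               # within the stated probe length
```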

5S rRNA was detected by ethidium staining of polyacrylamide gels prior to 
transfer. Blots were stripped by boiling in 0.1% aqueous sodium
dodecylsulfate/0.1x SSC (15 mM sodium chloride, 1.5 mM sodium citrate,
pH 7.0) for 10 min, and were re-probed up to 4 times until the 21-nt
signals became too weak for detection. Finally, blots were probed for 
val-tRNA as size marker. 
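
The salt concentrations quoted for 0.1x SSC imply a 1x composition of 150 mM sodium chloride and 15 mM sodium citrate. The sketch below simply scales that composition to other dilutions, such as the 1x and 0.2x washes named earlier for stringent hybridization; the function name `ssc_composition` is hypothetical.

```python
from typing import Dict

# 1x SSC in mM, back-calculated from the 0.1x values given in the text
# (15 mM sodium chloride, 1.5 mM sodium citrate).
SSC_1X_MM: Dict[str, float] = {"sodium chloride": 150.0, "sodium citrate": 15.0}


def ssc_composition(fold: float) -> Dict[str, float]:
    """Return the salt concentrations (mM) of SSC at the given fold strength."""
    if fold <= 0:
        raise ValueError("fold strength must be positive")
    return {salt: round(conc * fold, 6) for salt, conc in SSC_1X_MM.items()}


for fold in (1.0, 0.2, 0.1):
    print(f"{fold}x SSC:", ssc_composition(fold))
```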

For analysis of D. melanogaster RNAs, total RNA was prepared from
different developmental stages, as well as cultured Schneider-2 (S2) cells,
which originally derive from 20-24 h D. melanogaster embryos [15] (Fig. 1,
Table 1). miR-3 to miR-7 are expressed only during embryogenesis and not
at later developmental stages. The temporal expression of miR-1, miR-2
and miR-8 to miR-13 was less restricted. These miRNAs were observed at
all developmental stages though significant variations in the expression
levels were sometimes observed. Interestingly, miR-1, miR-3 to miR-6, and
miR-8 to miR-11 were completely absent from cultured Schneider-2 (S2)
cells, which were originally derived from 20-24 h D. melanogaster embryos
[15], while miR-2, miR-7, miR-12, and miR-13 were present in S2 cells,
therefore indicating cell type-specific miRNA expression. miR-1, miR-8, and
miR-12 expression patterns are similar to those of lin-4 stRNA in C.
elegans, as their expression is strongly upregulated in larvae and sustained
to adulthood [16]. miR-9 and miR-11 are present at all stages but are
strongly reduced in the adult, which may reflect a maternal contribution
from germ cells or expression in one sex only.

The mir-3 to mir-6 genes are clustered (Fig. 2A), and mir-6 is present as a
triple repeat with slight variations in the mir-6 precursor sequence but not
in the miRNA sequence itself. The expression profiles of miR-3 to miR-6 are
highly similar (Table 1), which suggests that a single embryo-specific
precursor transcript may give rise to the different miRNAs, or that the
same enhancer regulates miRNA-specific promoters. Several other fly
miRNAs are also found in gene clusters (Fig. 2A).

The expression of HeLa cell miR-15 to miR-33 was examined by Northern 
blotting using HeLa cell total RNA, in addition to total RNA prepared from 
mouse kidneys, adult zebrafish, Xenopus laevis ovary, and D. melanogaster 
S2 cells (Fig. 1B, Table 2). miR-15 and miR-16 are encoded in a gene
cluster (Fig. 2B) and are detected in mouse kidney, fish, and very weakly
in frog ovary, which may result from miRNA expression in somatic ovary
tissue rather than oocytes. mir-17 to mir-20 are also clustered (Fig. 2B),
and are expressed in HeLa cells and fish, but undetectable in mouse kidney 
and frog ovary (Fig. 1, Table 2), and therefore represent a likely case of 
tissue-specific miRNA expression. 

The majority of vertebrate and invertebrate miRNAs identified in this study 
are not related by sequence, but a few exceptions, similar to the highly 
conserved let-7 RNA [6], do exist. Sequence analysis of the D. 
melanogaster miRNAs revealed four such examples of sequence 
conservation between invertebrates and vertebrates. miR-1 homologs are
encoded in the genomes of C. elegans, C. briggsae, and humans, and are
found in cDNAs from zebrafish, mouse, cow and human. The expression of
mir-1 was detected by Northern blotting in total RNA from adult zebrafish
and C. elegans, but not in total RNA from HeLa cells or mouse kidney
(Table 2 and data not shown). Interestingly, while mir-1 and let-7 are
both expressed in adult flies (Fig. 1A) [6] and are both undetected in S2
cells, miR-1 is, in contrast to let-7, undetectable in HeLa cells. This
represents another case of tissue-specific expression of a miRNA, and
indicates that miRNAs may not only play a regulatory role in developmental
timing, but also in tissue specification. miR-7 homologs were found by
database searches in mouse and human genomic and expressed sequence
tag sequences (ESTs). Two mammalian miR-7 variants are predicted by
sequence analysis in mouse and human, and were detected by Northern
blotting in HeLa cells and fish, but not in mouse kidney (Table 2). Similarly,
we identified mouse and human miR-9 and miR-10 homologs by database
searches but only detected mir-10 expression in mouse kidney.

The identification of evolutionarily related miRNAs, which have already
acquired multiple sequence mutations, was not possible by standard
bioinformatic searches. Direct comparison of the D. melanogaster miRNAs
with the human miRNAs identified an 11-nt segment shared between D.
melanogaster miR-6 and HeLa miR-27, but no further relationships were 
detected. One may speculate that most miRNAs only act on a single target 
and therefore allow for rapid evolution by covariation, and that highly 
conserved miRNAs act on more than one target sequence, and therefore 
have a reduced probability for evolutionary drift by covariation [6]. An 
alternative interpretation is that the sets of miRNAs from D. melanogaster 
and humans are fairly incomplete and that many more miRNAs remain to 
be discovered, which will provide the missing evolutionary links. 

lin-4 and let-7 stRNAs were predicted to be excised from longer transcripts 
that contain approximately 30 base-pair stem-loop structures [1, 6].
Database searches for newly identified miRNAs revealed that all miRNAs 
are flanked by sequences that have the potential to form stable stem-loop 
structures (Fig. 3 and 4). In many cases, we were able to detect the 
predicted, approximately 70-nt precursors by Northern blotting (Fig. 1).
Some miRNA precursor sequences were also identified in mammalian cDNA
(EST) databases [27], indicating that primary transcripts longer than 70-nt
stem-loop precursors do also exist. We never cloned a 22-nt RNA
complementary to any of the newly identified miRNAs, and it is as yet
unknown how the cellular processing machinery distinguishes between the
miRNA and its complementary strand. Comparative analysis of the
precursor stem-loop structures indicates that the loops adjacent to the
base-paired miRNA segment can be located on either side of the miRNA
sequence (Fig. 3 and 4), suggesting that the 5' or 3' location of the stem-
closing loop is not the determinant of miRNA excision. It is also unlikely
that the structure, length or stability of the precursor stem is the critical 
determinant as the base-paired structures are frequently imperfect and 
interspersed by less stable, non-Watson-Crick base pairs such as G/A, U/U, 
C/U, A/A, and G/U wobbles. Therefore, a sequence-specific recognition
process is a likely determinant for miRNA excision, perhaps mediated by
members of the Argonaute (rde-1/ago1/piwi) protein family. Two members
of this family, alg-1 and alg-2, have recently been shown to be critical for 
stRNA processing in C. elegans [13]. Members of the Argonaute protein 
family are also involved in RNAi and PTGS. In D. melanogaster, these 
include argonaute2, a component of the siRNA-endonuclease complex 
(RISC) [17], and its relative aubergine, which is important for silencing of 
repeat genes [18]. In other species, these include rde-1, argonaute1, and
qde-2, in C. elegans [19], Arabidopsis thaliana [20], and Neurospora crassa 
[21], respectively. The Argonaute protein family therefore represents, 
besides the RNase III Dicer [12, 13], another evolutionary link between 
RNAi and miRNA maturation. 
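
The observation that precursor stems are frequently imperfect can be illustrated by tallying the kinds of base pairs in a stem. The sketch below is a simplified model, not the mfold procedure used for the figures: it assumes two arms of equal length with no bulges, and the function names and the toy arm sequences are hypothetical.

```python
from collections import Counter
from typing import Dict

WATSON_CRICK = {frozenset("AU"), frozenset("GC")}
WOBBLE = {frozenset("GU")}


def classify_pair(x: str, y: str) -> str:
    """Classify two opposing bases as Watson-Crick, G/U wobble or mismatch."""
    pair = frozenset((x.upper().replace("T", "U"), y.upper().replace("T", "U")))
    if pair in WATSON_CRICK:
        return "Watson-Crick"
    if pair in WOBBLE:
        return "G/U wobble"
    return "mismatch"  # e.g. G/A, U/U, C/U or A/A


def stem_summary(five_prime_arm: str, three_prime_arm: str) -> Dict[str, int]:
    """Count pair types in a stem; both arms are written 5' to 3'."""
    if len(five_prime_arm) != len(three_prime_arm):
        raise ValueError("arms must have equal length in this simplified model")
    # Reverse the 3' arm so that position i of the 5' arm faces its partner.
    partners = zip(five_prime_arm, reversed(three_prime_arm))
    return dict(Counter(classify_pair(x, y) for x, y in partners))


# Hypothetical toy arms, not taken from any precursor shown in the figures.
print(stem_summary("GGAUGA", "ACGUUC"))
```

In this toy stem, three pairs are Watson-Crick, two are G/U wobbles and one is an A/A mismatch.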

Despite advanced genome projects, computer-assisted detection of genes 
encoding functional RNAs remains problematic [22]. Cloning of expressed, 
short functional RNAs, similar to EST approaches (RNomics), is a powerful 
alternative and probably the most efficient method for identification of such 
novel gene products [23-26]. The number of functional RNAs has been
widely underestimated and is expected to grow rapidly because of the
development of new functional RNA cloning methodologies.

The challenge for the future is to define the function and the potential 
targets of these novel miRNAs by using bioinformatics as well as genetics,
and to establish a complete catalogue of time- and tissue-specific
distribution of the already identified and yet to be uncovered miRNAs. lin-4
and let-7 stRNAs negatively regulate the expression of proteins encoded by
mRNAs whose 3' untranslated regions contain sites of complementarity to
the stRNA [3-5]. 

Thus, a series of 33 novel genes, coding for 19- to 23-nucleotide 
microRNAs (miRNAs), has been cloned from fly embryos and human cells. 
Some of these miRNAs are highly conserved between vertebrates and 
invertebrates and are developmentally or tissue-specifically expressed. Two 
of the characterized human miRNAs may function as tumor suppressors in 
B-cell chronic lymphocytic leukemia. miRNAs are related to a small class of 
previously described 21- and 22-nt RNAs (lin-4 and let-7 RNAs), so-called 
small temporal RNAs (stRNAs), which regulate developmental timing in C. 
elegans and other species. Similar to stRNAs, miRNAs are presumed to 
regulate translation of specific target mRNAs by binding to partially 
complementary sites, which are present in their 3'-untranslated regions. 

Deregulation of miRNA expression may be a cause of human disease, and 
detection of expression of miRNAs may become useful as a diagnostic. 
Regulated expression of miRNAs in cells or tissue devoid of particular 
miRNAs may be useful for tissue engineering, and delivery or transgenic 
expression of miRNAs may be useful for therapeutic intervention. miRNAs 
may also represent valuable drug targets themselves. Finally, miRNAs and their 
precursor sequences may be engineered to recognize therapeutically valuable 
targets. 

EXAMPLE 2: miRNAs from mouse. 

To gain more detailed insights into the distribution and function of miRNAs 
in mammals, we investigated the tissue-specific distribution of miRNAs in 
adult mouse. Cloning of miRNAs from specific tissues was preferred over 
whole organism-based cloning because low-abundance miRNAs that 
normally go undetected by Northern blot analysis are identified clonally. 
Also, in situ hybridization techniques for detecting 21-nt RNAs have not 
yet been developed. Therefore, 19- to 25-nucleotide RNAs were cloned 
and sequenced from total RNA, which was isolated from 18.5-week-old 
BL6 mice. Cloning of miRNAs was performed as follows: 0.2 to 1 mg of 
total RNA was separated on a 15% denaturing polyacrylamide gel and RNA 
of 19- to 25-nt size was recovered. A 5'-phosphorylated 3'-adapter 
oligonucleotide (5'-pUUUaaccgcgaattccagx: uppercase, RNA; lowercase, 
DNA; p, phosphate; x, 3'-Amino-Modifier C-7, ChemGenes, Ashland, MA, 
USA, Cat. No. NSS-1004; SEQ ID NO:54) and a 5'-adapter oligonucleotide 
(5'-acggaattcctcactAAA: uppercase, RNA; lowercase, DNA; SEQ ID 
NO:55) were ligated to the short RNAs. RT/PCR was performed with the 
3'-primer (5'-GACTAGCTGGAATTCGCGGTTAAA; SEQ ID NO:56) and the 
5'-primer (5'-CAGCCAACGGAATTCCTCACTAAA; SEQ ID NO:57). In order 
to introduce Ban I restriction sites, a second PCR was performed using the 
primer pair 5'-CAGCCAACAGGCACCGAATTCCTCACTAAA (SEQ ID 
NO:57) and 5'-GACTAGCTTGGTGCCGAATTCGCGGTTAAA (SEQ ID 
NO:56), followed by concatamerization after Ban I digestion and T4 DNA 
ligation. Concatamers of 400 to 600 basepairs were cut out from 1.5% 
agarose gels and recovered by Biotrap (Schleicher & Schuell) electroelution 
(1x TAE buffer) and by ethanol precipitation. Subsequently, the 3' ends of 
the concatamers were filled in by incubating for 15 min at 72°C with Taq 
polymerase in standard PCR reaction mixture. This solution was diluted 
3-fold with water and directly used for ligation into pCR2.1 TOPO vectors. 
Clones were screened for inserts by PCR and 30 to 50 samples were 
subjected to sequencing. Because RNA was prepared from the combined 
tissues of several mice, minor sequence variations that were detected 
multiple times in multiple clones may reflect polymorphisms rather than 
RT/PCR mutations. Public database searching was used to identify the 
genomic sequences encoding the approx. 21-nt RNAs. The occurrence of 
a 20 to 30 basepair fold-back structure involving the immediate upstream 
or downstream flanking sequences was used to assign miRNAs [36-38]. 
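
The introduction of Ban I sites by the second PCR can be checked directly on the primer sequences quoted above. The following is a minimal sketch, assuming the commonly documented Ban I recognition sequence GGYRCC (Y = C or T, R = A or G), which is not stated in this document; the function and dictionary names are illustrative only.

```python
import re

# Assumed Ban I recognition sequence: GGYRCC (Y = C/T, R = A/G).
# A lookahead is used so that overlapping sites would also be reported.
BAN_I_SITE = re.compile(r"(?=GG[CT][AG]CC)")


def ban_i_sites(dna: str) -> list[int]:
    """Return 0-based start positions of Ban I sites in a DNA sequence."""
    return [m.start() for m in BAN_I_SITE.finditer(dna.upper())]


# Primer sequences as quoted in the text (5' to 3').
PRIMERS = {
    "first PCR, 3'-primer": "GACTAGCTGGAATTCGCGGTTAAA",
    "first PCR, 5'-primer": "CAGCCAACGGAATTCCTCACTAAA",
    "second PCR, 5'-primer": "CAGCCAACAGGCACCGAATTCCTCACTAAA",
    "second PCR, 3'-primer": "GACTAGCTTGGTGCCGAATTCGCGGTTAAA",
}

if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: Ban I sites at {ban_i_sites(seq)}")
```

Under this assumption the first-round primers contain no Ban I site, whereas each second-round primer carries exactly one (GGCACC and GGTGCC, respectively), in line with the stated purpose of the second PCR.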

We examined 9 different mouse tissues and identified 34 novel miRNAs, 
some of which are highly tissue-specifically expressed (Table 3 and Figure 
5). Furthermore, we identified 33 new miRNAs from different mouse 
tissues and also from human SAOS-2 osteosarcoma cells (Table 4). miR-1 
was previously shown by Northern analysis to be strongly expressed in 
adult heart, but not in brain, liver, kidney, lung or colon [37]. Here we 
show that miR-1 accounts for 45% of all mouse miRNAs found in heart, 
yet miR-1 was still expressed at a low level in liver and midbrain even 
though it remained undetectable by Northern analysis. Three copies or 
polymorphic alleles of miR-1 were found in mice. The conservation of 
tissue-specific miR-1 expression between mouse and human provides 
additional evidence for a conserved regulatory role of this miRNA. In liver, 
variants of miR-122 account for 72% of all cloned miRNAs and miR-122 
was undetected in all other tissues analyzed. In spleen, miR-143 appeared 
to be most abundant, at a frequency of approx. 30%. In colon, 
miR-142-as was cloned several times and also appeared at a frequency of 
30%. In small intestine, too few miRNA sequences were obtained to permit 
statistical analysis. This was due to strong RNase activity in this tissue, 
which caused significant breakdown of abundant non-coding RNAs, e.g. 
rRNA, so that the fraction of miRNA in the cloned sequences was very 
low. For the same reason, no miRNA sequences were obtained from 
pancreas. 

To gain insights into neural tissue miRNA distribution, we analyzed cortex, 
cerebellum and midbrain. Similar to heart, liver and small intestine, variants 
of a particular miRNA, miR-124, dominated and accounted for 25 to 48% 
of all brain miRNAs. miR-101, -127, -128, -131, and -132, also cloned 
from brain tissues, were further analyzed by Northern blotting and shown 
to be predominantly brain-specific. Northern blot analysis was performed as 
described in Example 1. tRNAs and 5S rRNA were detected by ethidium 
staining of polyacrylamide gels prior to transfer to verify equal loading. 
Blots were stripped by boiling in deionized water for 5 min, and reprobed 
up to 4 times until the 21-nt signals became too weak for detection. 

miR-125a and miR-125b are very similar to the sequence of C. elegans 
lin-4 stRNA and may represent its orthologs (Fig. 6A). This is of great 
interest because, unlike let-7, which was readily detected in other species, 
lin-4 has acquired a few mutations in the central region and thus escaped 
bioinformatic database searches. Using the mouse sequence miR-125b, we 
could readily identify its ortholog in the D. melanogaster genome. 
miR-125a and miR-125b differ only by a central diuridine insertion and a U 
to C change. miR-125b is very similar to lin-4 stRNA with the differences 
located only in the central region, which is presumed to be bulged out 
during target mRNA recognition [41]. miR-125a and miR-125b were cloned 
from brain tissue, but expression was also detected by Northern analysis in 
other tissues, consistent with the role for lin-4 in regulating neuronal 
remodeling by controlling lin-14 expression [43]. Unfortunately, orthologs 
to C. elegans lin-14 have not been described and miR-125 targets remain 
to be identified in D. melanogaster or mammals. Finally, miR-125b 
expression is also developmentally regulated and only detectable in pupae 
and adult but not in embryo or larvae of D. melanogaster (Fig. 6B). 
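
The relationship between miR-125a and miR-125b described above can be made explicit by comparing the gapped alignment given in Table 3 column by column. This is a sketch only: the miR-125a entry is partly garbled in the table and is read here as UCCCUGAGACCCUUUAACCUGUG, which is an assumption, and the trailing gap is added merely to bring both rows to the same length.

```python
# Gapped alignment as laid out in Table 3 ("-" marks an alignment gap).
# The miR-125a reading is an assumption (the table entry is garbled);
# its final "-" only pads the row to the length of miR-125b.
MIR_125A = "UCCCUGAGACCCUUUAACCUGUG-"
MIR_125B = "UCCCUGAGACCCU--AACUUGUGA"


def compare_aligned(a: str, b: str) -> list[tuple[int, str, str, str]]:
    """List differing columns as (1-based position, base in a, base in b, kind)."""
    if len(a) != len(b):
        raise ValueError("aligned rows must have equal length")
    differences = []
    for pos, (x, y) in enumerate(zip(a, b), start=1):
        if x == y:
            continue
        kind = "gap" if "-" in (x, y) else "substitution"
        differences.append((pos, x, y, kind))
    return differences


if __name__ == "__main__":
    for pos, x, y, kind in compare_aligned(MIR_125A, MIR_125B):
        print(f"column {pos}: miR-125a {x} / miR-125b {y} ({kind})")
```

With these inputs the comparison reports the two central uridines present only in miR-125a, one C/U substitution, and a 3'-terminal overhang of one nucleotide, which is the kind of end heterogeneity noted in the legend of Table 3.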

Sequence comparison of mouse miRNAs with previously described miRNAs 
reveals that miR-99b and miR-99a are similar to D. melanogaster, mouse 
and human miR-10 as well as C. elegans miR-51 [36], miR-141 is similar to 
D. melanogaster miR-8, miR-29b is similar to C. elegans miR-83, and 
miR-131 and miR-142-s are similar to D. melanogaster miR-4 and C. 
elegans miR-79 [36]. miR-124a is conserved between invertebrates and 
vertebrates. In this respect it should be noted that almost every miRNA 
cloned from mouse was also encoded in the human genome, and 
frequently detected in other vertebrates, such as the pufferfish, Fugu 
rubripes, and the zebrafish, Danio rerio. Sequence conservation may point 
to conservation in function of these miRNAs. Comprehensive information 
about orthologous sequences is listed in Fig. 7. 

In two cases both strands of miRNA precursors were cloned (Table 3), 
which was previously observed once for a C. elegans miRNA [36]. It is 
thought that the most frequently cloned strand of a miRNA precursor 
represents the functional miRNA, which here is miR-30c-s and miR-142-as, 
with s and as indicating the 5' or 3' side of the fold-back structure, respectively. 

The mir-142 gene is located on chromosome 17, but was also found at the 
breakpoint junction of a t(8;17) translocation, which causes an aggressive 
B-cell leukemia due to strong up-regulation of a translocated MYC gene 
[44]. The translocated MYC gene, which was also truncated at the first 
exon, was located only 4-nt downstream of the 3'-end of the miR-142 
precursor. This suggests that translocated MYC was under the control of 
the upstream miR-142 promoter. Alignment of mouse and human miR-142 
containing EST sequences indicates an approximately 20 nt conserved 
sequence element downstream of the mir-142 hairpin. This element was 
lost in the translocation. It is conceivable that the absence of the 
conserved downstream sequence element in the putative miR-142/mRNA 
fusion prevented the recognition of the transcript as a miRNA precursor 
and therefore may have caused accumulation of fusion transcripts and 
overexpression of MYC. 

miR-155, which was cloned from colon, is excised from the known 
noncoding BIC RNA [47]. BIC was originally identified as a gene 
transcriptionally activated by promoter insertion at a common retroviral 
integration site in B cell lymphomas induced by avian leukosis virus. 
Comparison of BIC cDNAs from human, mouse and chicken revealed 78% 
identity over 138 nucleotides [47]. The identity region covers the miR-155 
fold-back precursor and a few conserved boxes downstream of the 
fold-back sequence. The relatively high level of expression of BIC in 
lymphoid organs and cells in human, mouse and chicken implies an 
evolutionarily conserved function, but BIC RNA has also been detected at 
low levels in non-hematopoietic tissues [47]. 

Another interesting observation was that segments of perfect 
complementarity to miRNAs are not observed in mRNA sequences or in 
genomic sequences outside the miRNA inverted repeat. Although this could 
be fortuitous, based on the link between RNAi and miRNA processing [11, 
13, 43] it may be speculated that miRNAs retain the potential to cleave 
perfectly complementary target RNAs. Because translational control 
without target degradation could provide more flexibility it may be 
preferred over mRNA degradation. 
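
A search for segments of perfect complementarity, as referred to above, amounts to looking for the reverse complement of a miRNA within a transcript. The following is a minimal sketch; the miRNA is D. melanogaster miR-1 from Table 1, whereas the transcript is a hypothetical toy sequence constructed for illustration, and the function names are not taken from this document.

```python
# Map each RNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the perfectly complementary strand, written 5' to 3'."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def perfect_sites(mirna: str, transcript: str) -> list[int]:
    """Return 0-based start positions of perfectly complementary sites."""
    site = reverse_complement(mirna)
    text = transcript.upper()
    hits = []
    start = text.find(site)
    while start != -1:
        hits.append(start)
        start = text.find(site, start + 1)
    return hits


MIR_1 = "UGGAAUGUAAAGAAGUAUGGAG"  # D. melanogaster miR-1 (Table 1)

# Hypothetical toy transcript: four flanking bases, one perfect site, four more bases.
TOY_TRANSCRIPT = "AAAA" + reverse_complement(MIR_1) + "CCCC"

if __name__ == "__main__":
    print("perfect site:", reverse_complement(MIR_1))
    print("positions in toy transcript:", perfect_sites(MIR_1, TOY_TRANSCRIPT))
```

Only exact matches are reported here; sites of partial complementarity, such as those presumed to mediate translational repression, would require a scoring scheme that this sketch does not attempt.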

In summary, 63 novel miRNAs were identified from mouse and 4 novel 
miRNAs were identified from human SAOS-2 osteosarcoma cells (Table 3 
and Table 4), which are conserved in human and often also in other 
non-mammalian vertebrates. A few of these miRNAs appear to be 
extremely tissue-specific, suggesting a critical role for some miRNAs in 
tissue-specification and cell lineage decisions. We may have also identified 
the fruitfly and mammalian ortholog of C. elegans lin-4 stRNA. The 
establishment of a comprehensive list of miRNA sequences will be 
instrumental for bioinformatic approaches that make use of completed 
genomes and the power of phylogenetic comparison in order to identify 
miRNA-regulated target mRNAs. 

Table 1 

D. melanogaster miRNAs. The sequences given represent the most 
abundant, and typically longest, miRNA sequence identified by cloning; 
miRNAs frequently vary in length by one or two nucleotides at their 3' 
termini. From 222 short RNAs sequenced, 69 (31%) corresponded to 
miRNAs, 103 (46%) to already characterized functional RNAs (rRNA, 7SL 
RNA, tRNAs), 30 (14%) to transposon RNA fragments, and 20 (10%) 
sequences with no database entry. The frequency (freq.) for cloning a 
particular miRNA relative to all identified miRNAs is indicated in percent. 
Results of Northern blotting of total RNA isolated from staged populations 
of D. melanogaster are summarized. E, embryo; L, larval stage; P, pupae; 
A, adult; S2, Schneider-2 cells. The strength of the signal within each blot 
is represented from strongest (+++) to undetected (-). let-7 stRNA was 
probed as control. GenBank accession numbers and homologs of miRNAs 
identified by database searching in other species are provided as 
supplementary material. 



miRNA 


sequence (5' to 3') 


freq. 
(%) 


E 

0-3 h 


E 

0-6 h 


L1 + 
L2 


L3 


P 


A 


S2 


miR-1 


UGGAAUGUAAAGAAGUAUGGAG 
(SEQ ID NO:58) 


32 


+ 


+ 



++ 
+ 


++ 
+ 


++ 


++ 
+ 




miR-2a* 


UAUCACAGCCAGCUUUGAUGAGC 
(SEQ ID NO: 59) 


3 
















miR-2b* 


UAUCACAGCCAGCUUUGAGGAGC 
(SEQ ID NO: 60) 


3 


++ 


++ 


++ 


++ 
+ 


++ 


+ 


++ 

+ 


miR-3 


UCACUGGGCAAAGUGUGUCUCA# 


9 


+++ 


+++ 












miR-4 


AUAAAGCUAGACAACCAUUGA 
(SEQ ID NO: 62) 


6 


+++ 


+++ 












miR-5 


AAAGGAACGAUCGUUGUGAUAUG 
(SEQ ID NO: 63) 


1 


+++ 


+++ 


+/- 










miR-6 


UAUCACAGUGGCUGUUCUUUUU 
(SEQ ID NO: 64) 


13 


+++ 


+++ 


+/- 


+/- 








miR-7 


UGGAAGACUAGUGAUUUUGUUGU 
(SEQ ID NO: 65) 


4 


+++ 


++ 




+/- 


+/- 






miR-8 


UAAUACUGUCAGGUAAAGAUGUC 
(SEQ ID NO: 66) 


3 




+/- 


++ 
+ 


++ 

+ 


+ 


++ 
+ 


miR-9 


UCUUUGGUUAUCUAGCUGUAUGA 
(SEQ ID NO: 67) 


7 


+++ 


++ 


++ 
+ 


++ 
+ 


+ 






miR-10 


ACCCUGUAGAUCCGAAUUUGU 
(SEQ ID NO: 68) 


1 


+ 


+ 


++ 


++ 
+ 




+ 




miR-11 


CAUCACAGUCUGAGUUCUUGC 
(SEQ ID NO: 69) 


7 



+++ 



+++ 


++ 
+ 




++ 

+ 


+ 




miR-12 


UGAGUAUUACAUCAGGUACUGGU 
(SEQ ID NO: 70) 


7 


+ 


+ 






+ 


+ 




miR-13a* 


UAUCACAGCCAUUUUGACGAGU 
(SEQ ID NO: 71) 


1 


+++ 


+++ 


++ 
+ 



++ 
+ 


+ 



++ 
+ 


++ 
+ 


miR-13b* 


UAUCACAGCCAUUUUGAUGAGU 
(SEQ ID NO: 72) 


0 
















miR-14 


UCAGUCUUUUUCUCUCUCCUA 
(SEQ ID NO: 73) 


1 - 
















let-7 


UGAGGUAGUAGGUUGUAUAGUU 
(SEQ ID NO: 74) 


0 










++ 
+ 


++ 
+ 





# = (SEQ ID NO:61) 

*Similar miRNA sequences are difficult to distinguish by Northern 
blotting because of potential cross-hybridization of probes. 

Table 2 

Human miRNAs. From 220 short RNAs sequenced, 100 (45%) 
corresponded to miRNAs, 53 (24%) to already characterized functional 
RNAs (rRNA, snRNAs, tRNAs), and 67 (30%) sequences with no database 
entry. Results of Northern blotting of total RNA isolated from different 
vertebrate species and S2 cells are indicated. For legend, see Table 1. 



miRNA 


sequence (5' to 3') 


freq. 
(%) 


HeLa 
cells 


mouse 
kidney 


adult 
fish 


frog 
ovary 


S2 


let-7a* 


UGAGGUAGUAGGUUGUAUAGUU# 


10 


+++ 


+++ 


+++ 


let-7b* 


UGAGGUAGUAGGUUGUGUGGUU 
(SEQ ID NO:76) 


13 








let-7c* 


UGAGGUAGUAGGUUGUAUGGUU 
(SEQ ID NO: 77) 


3 










• 


let-7d* 


AGAGGUAGUAGGUUGCAUAGU 
(SEQ ID NO:78) 


2 


+++ 


+++ 


+++ 


- 


- 


let-7e* 


UGAGGUAGGAGGUUGUAUAGU 
(SEQ ID NO: 79) 


2 


+++ 


+++ 


+++ 


- 


- 


let-7f* 


UGAGGUAGUAGAUUGUAUAGUU 
(SEQ ID NO:80) 


1 












miR-15 


UAGCAGCACAUAAUGGUUUGUG 
(SEQ ID NO: 81) 


3 


+++ 


++ 


+ 


+/- 


- 


miR-16 


UAGCAGCACGUAAAUAUUGGCG 
(SEQ ID NO: 82) 


10 


+++ 


+ 




+/- 


- 


miR-17 


ACUGCAGUGAAGGCACUUGU 
(SEQ ID NO: 83) 


1 


+++ 










miR-18 


UAAGGUGCAUCUAGUGCAGAUA 
(SEQ ID NO:84) 


2 


+++ 










miR-19a* 


UGUGCAAAUCUAUGCAAAACUGA 
(SEQ ID NO: 85) 


1 


+++ 










miR-19b* 


UGUGCAAAUCCAUGCAAAACUGA 
(SEQ ID NO: 86) 


3 












miR-20 


UAAAGUGCUUAUAGUGCAGGUA 
(SEQ ID NO: 87) 


4 


+++ 




+ 






miR-21 


UAGCUUAUCAGACUGAUGUUGA 
(SEQ ID NO:88) 


10 


+++ 


+ 


++ 






miR-22 


AAGCUGCCAGUUGAAGAACUGU 
(SEQ ID NO:89) 


10 


+++ 


+++ 


+ 


+/- 




miR-23 


AUCACAUUGCCAGGGAUUUCC 
(SEQ ID NO: 90) 


2 


+++ 


+++ 


+++ 


+ 


miR-24 


UGGCUCAGUUCAGCAGGAACAG 
(SEQ ID NO: 91) 


4 


++ 


+++ 


++ 


- 


- 


miR-25 


CAUUGCACUUGUCUCGGUCUGA 
(SEQ ID NO: 92) 


3 


+++ 


+ 


++ 


- 


- 


miR-26a* 


UUCAAGUAAUCCAGGAUAGGCU 
(SEQ ID NO: 93) 

• 


2 


+ 


++ 


+++ 

• 


- 


- 


miR-26b* 


UUCAAGUAAUUCAGGAUAGGUU 
(SEQ ID NO: 94) 


1 










- 


miR-27 


UUCACAGUGGCUAAGUUCCGCU 
(SEQ ID NO: 95) 


2 


+++ 


+++ 




- 


• 


miR-28 


AAGGAGCUCACAGUCUAUUGAG 
(SEQ ID NO: 96) 


2 




+++ 


- 

• 


- 




miR-29 


CUAGCACCAUCUGAAAUCGGUU 
(SEQ ID NO: 97) 


2 


+ 


+++ 

• * 


+/- 


• 




miR-30 


CUUUCAGUCGGAUGUUUGCAGC 
(SEQ ID NO: 98) 


2 


+++ 


+++ 


+++ 




• 


miR-31 


GGCAAGAUGCUGGCAUAGCUG 
(SEQ ID NO: 99) 


2 


+++ 


- 


- 




- 


miR-32 


UAUUGCACAUUACUAAGUUGC 
(SEQ ID NO: 100) 


1 


- 


- 




- 


- 


miR-33 

■ 


GUGCAUUGUAGUUGCAUUG 
(SEQ ID NO: 101) 


1 


- 


- 


- 


- 


- 


miR-1 


UGGAAUGUAAAGAAGUAUGGAG 
(SEQ ID NO: 102) 


0 


- 


- 


+ 


- 


- 


miR-7 


UGGAAGACUAGUGAUUUUGUUGU 
(SEQ ID NO: 103) 


0 


+ 










miR-9 


UCUUUGGUUAUCUAGCUGUAUGA 
(SEQ ID NO: 104) 


0 












miR-10 


ACCCUGUAGAUCCGAAUUUGU 
(SEQ ID NO: 105) 


0 




+ 









# = (SEQ ID NO:75) 



*Similar miRNA sequences are difficult to distinguish by Northern 
blotting because of potential cross-hybridization of probes. 
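
The composition figures quoted in the legend of Table 2 can be reproduced from the clone counts given there. This is a minimal sketch; the dictionary keys paraphrase the categories in the legend and the function name is illustrative.

```python
# Clone counts from the legend of Table 2 (220 short RNAs sequenced in total).
CLONE_COUNTS = {
    "miRNAs": 100,
    "characterized functional RNAs (rRNA, snRNAs, tRNAs)": 53,
    "sequences with no database entry": 67,
}


def composition_percent(counts: dict[str, int]) -> dict[str, int]:
    """Return each category as a whole-number percentage of all clones."""
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.items()}


if __name__ == "__main__":
    print("total clones:", sum(CLONE_COUNTS.values()))
    for name, pct in composition_percent(CLONE_COUNTS).items():
        print(f"{name}: {pct}%")
```

The rounded values (45%, 24% and 30%) agree with the legend; because each value is rounded separately, they need not sum to exactly 100%.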
Table 3 

Mouse miRNAs. The sequences indicated represent the longest miRNA 
sequences identified by cloning. The 3'-terminus of miRNAs is often 
truncated by one or two nucleotides. miRNAs that are more than 85% 
identical in sequence (i.e. share 18 out of 21 nucleotides) or contain 1- or 
2-nucleotide internal deletions are referred to by the same gene number 
followed by a lowercase letter. Minor sequence variations between related 
miRNAs are generally found near the ends of the miRNA sequence and are 
thought to not compromise target RNA recognition. Minor sequence 
variations may also represent A to G and C to U changes, which are 
accommodated as G-U wobble base pairs during target recognition. 
miRNAs with the suffix -s or -as indicate RNAs derived from either the 
5'-half or the 3'-half of a miRNA precursor. Mouse brains were dissected into 
midbrain, mb, cortex, cx, cerebellum, cb. The tissues analyzed were heart, 
ht; liver, lv; spleen, sp; small intestine, si; colon, co; cortex, ct; cerebellum, cb; 
midbrain, mb. 



miRNA sequence (5' to 3') Number of clones 

ht lv sp si co cx cb mb 



let-7a UGAGGUAGUAGGUUGUAUAGUU 
(SEQ ID NO:106) 

let-7b UGAGGUAGUAGGUUGUGUGGUU 
(SEQ ID NO:107) 

let-7c UGAGGUAGUAGGUUGUAUGGUU 
(SEQ ID NO:108) 

let-7d AGAGGUAGUAGGUUGCAUAGU 
(SEQ ID NO:109) 

let-7e UGAGGUAGGAGGUUGUAUAGU 
(SEQ ID NO:110) 

let-7f UGAGGUAGUAGAUUGUAUAGUU 
(SEQ ID NO:111) 

let-7g UGAGGUAGUAGUUUGUACAGUA 
(SEQ ID NO:112) 

let-7h UGAGGUAGUAGUGUGUACAGUU 
(SEQ ID NO:113) 



3 

1 1 
2 

2 

1 


2 



1 1 7 

2 5 
2 5 19 

2 2 2 

2 

3 3 
1 1 2 
1 1 

let-7i UGAGGUAGUAGUUUGUGCU 
(SEQ ID NO:114) 

miR-1b UGGAAUGUAAAGAAGUAUGUAA 4 
(SEQ ID NO:115) 

miR-1c UGGAAUGUAAAGAAGUAUGUAC 7 
(SEQ ID NO:116) 

miR-1d UGGAAUGUAAAGAAGUAUGUAUU 16 
(SEQ ID NO:117) 

miR-9 UCUUUGGUUAUCUAGCUGUAUGA 
(SEQ ID NO:118) 



1 



1 



1 



1 



miR-15a UAGCAGCACAUAAUGGUUUGUG 1 
(SEQ ID NO:119) 

miR-15b UAGCAGCACAUCAUGGUUUACA 1 
(SEQ ID NO:120) 

miR-16 UAGCAGCACGUAAAUAUUGGCG 1 
(SEQ ID NO:121) 

miR-18 UAAGGUGCAUCUAGUGCAGAUA 
(SEQ ID NO:122) 

miR-19b UGUGCAAAUCCAUGCAAAACUGA 
(SEQ ID NO:123) 

miR-20 UAAAGUGCUUAUAGUGCAGGUAG 
(SEQ ID NO:124) 

miR-21 UAGCUUAUCAGACUGAUGUUGA 1 
(SEQ ID NO:125) 

miR-22 AAGCUGCCAGUUGAAGAACUGU 2 
(SEQ ID NO:126) 

miR-23a AUCACAUUGCCAGGGAUUUCC 1 
(SEQ ID NO:127) 

miR-23b AUCACAUUGCCAGGGAUUACCAC 
(SEQ ID NO:128) 

miR-24 UGGCUCAGUUCAGCAGGAACAG 1 
(SEQ ID NO:129) 

miR-26a UUCAAGUAAUCCAGGAUAGGCU 
(SEQ ID NO:130) 

miR-26b UUCAAGUAAUUCAGGAUAGGUU 
(SEQ ID NO:131) 

miR-27a UUCACAGUGGCUAAGUUCCGCU 1 
(SEQ ID NO:132) 

miR-27b UUCACAGUGGCUAAGUUCUG 
(SEQ ID NO:133) 

miR-29a CUAGCACCAUCUGAAAUCGGUU 1 
(SEQ ID NO:134) 

miR-29b/miR-102 UAGCACCAUUUGAAAUCAGUGUU 1 
(SEQ ID NO:135) 

miR-29c/ UAGCACCAUUUGAAAUCGGUUA 1 
(SEQ ID NO:136) 



1 



1 



1 



1 



1 



1 



1 



1 



1 



1 



1 



1 



1 



1 



1 



1 



1 



1 



1 



1 



1 

miR-30a-s/miR-97 UGUAAACAUCCUCGACUGGAAGC 1 1 
(SEQ ID NO:137) ^ 

miR-30a-as* CUUUCAGUCGGAUGUUUGCAGC , 
(SEQ ID NO:138) ^ 

miR-30b UGUAAACAUCCUACACUCAGC 1 ^ 
(SEQ ID NO:139) ^ 

miR-30c UGUAAACAUCCUACACUCUCAGC 2 1 1 
(SEQ ID NO:140) 

miR-30d UGUAAACAUCCCCGACUGGAAG 1 
(SEQ ID NO:141) 

miR-99a/miR-99 ACCCGUAGAUCCGAUCUUGU 1 
(SEQ ID NO:142) 

miR-99b CACCCGUAGAACCGACCUUGCG 1 
(SEQ ID NO:143) 

miR-101 UACAGUACUGUGAUAACUGA 2 1 
(SEQ ID NO:144) 

miR-122a UGGAGUGUGACAAUGGUGUUUGU 3 
(SEQ ID NO:145) 

miR-122b UGGAGUGUGACAAUGGUGUUUGA 1 1 
(SEQ ID NO:146) 

miR-122a,b UGGAGUGUGACAAUGGUGUUUG 23 
(SEQ ID NO:147) 

miR-123 CAUUAUUACUUUUGGUACGCG 1 2 
(SEQ ID NO:148) 

miR-124a** UUAAGGCACGCGG-UGAAUGCCA 1 37 a\ 
(SEQ ID NO:149) 

miR-124b UUAAGGCACGCGGGUGAAUGC 1 3 
(SEQ ID NO:150) 

miR-125a UCCCUGAGACCCUUUAACCUGUG 1 1 
(SEQ ID NO:151) 

miR-125b UCCCUGAGACCCU--AACUUGUGA 1 
(SEQ ID NO:152) 

miR-126 UCGUACCGUGAGUAAUAAUGC 4 1 
(SEQ ID NO:153) 

miR-127 UCGGAUCCGUCUGAGCUUGGCU 1 
(SEQ ID NO:154) 

miR-128 UCACAGUGAACCGGUCUCUUUU 2 2 
(SEQ ID NO:155) 

miR-129 CUUUUUUCGGUCUGGGCUUGC \ 
(SEQ ID NO:156) 

miR-130 CAGUGCAAUGUUAAAAGGGC \ 
(SEQ ID NO:157) 

miR-131 UAAAGCUAGAUAACCGAAAGU j \ 
(SEQ ID NO:158) 

miR-132 UAACAGUCUACAGCCAUGGUCGU \ 
(SEQ ID NO:159) 

miR-133 UUGGUCCCCUUCAACCAGCUGU 
(SEQ ID NO:160) 

miR-134 UGUGACUGGUUGACCAGAGGGA 
(SEQ ID NO:161) 

miR-135 UAUGGCUUUUUAUUCCUAUGUGAA 
(SEQ ID NO:162) 

miR-136 ACUCCAUUUGUUUUGAUGAUGGA 
(SEQ ID NO:163) 

miR-137 UAUUGCUUAAGAAUACGCGUAG 
(SEQ ID NO:164) 

miR-138 AGCUGGUGUUGUGAAUC 
(SEQ ID NO:165) 

miR-139 UCUACAGUGCACGUGUCU 
(SEQ ID NO:166) 

miR-140 AGUGGUUUUACCCUAUGGUAG 
(SEQ ID NO:167) 

miR-141 AACACUGUCUGGUAAAGAUGG 
(SEQ ID NO:168) 

miR-142-s CAUAAAGUAGAAAGCACUAC 
(SEQ ID NO:169) 

miR-142-as UGUAGUGUUUCCUACUUUAUGG 
(SEQ ID NO:170) 

miR-143 UGAGAUGAAGCACUGUAGCUCA 
(SEQ ID NO:171) 

miR-144 UACAGUAUAGAUGAUGUACUAG 
(SEQ ID NO:172) 

miR-145 GUCCAGUUUUCCCAGGAAUCCCUU 
(SEQ ID NO:173) 

miR-146 UGAGAACUGAAUUCCAUGGGUUU 
(SEQ ID NO:174) 

miR-147 GUGUGUGGAAAUGCUUCUGCC 
(SEQ ID NO:175) 

miR-148 UCAGUGCACUACAGAACUUUGU 
(SEQ ID NO:176) 

miR-149 UCUGGCUCCGUGUCUUCACUCC 
(SEQ ID NO:177) 

miR-150 UCUCCCAACCCUUGUACCAGUGU 
(SEQ ID NO:178) 

miR-151 CUAGACUGAGGCUCCUUGAGGU 
(SEQ ID NO:179) 

miR-152 UCAGUGCAUGACAGAACUUGG 
(SEQ ID NO:180) 

miR-153 UUGCAUAGUCACAAAAGUGA 
(SEQ ID NO:181) 

miR-154 UAGGUUAUCCGUGUUGCCUUCG 
(SEQ ID NO:182) 



1 



1 



1 



1 



1 



1 



1 



1 



1 



1 



1 



1 



1 



1 



1 



1 



1 



1 



1 



1 



miR-155 UUAAUGCUAAUUGUGAUAGGGG 1 

(SEQ ID NO: 183) 



*The originally described miR-30 was renamed to miR-30a-as in order to distinguish 
it from the miRNA derived from the opposite strand of the precursor encoded by the 
mir-30a gene. miR-30a-s is equivalent to miR-97 [46]. 

**A 1-nt length heterogeneity is found on both the 5' and 3' ends. The 22-nt miR sequence 
is shown, but only 21-nt miRNAs were cloned. 



wo 03/029459 PCT/EP02/10881 


Table 4 



Mouse and human miRNAs. The sequences indicated represent the longest 
miRNA sequences identified by cloning. The 3' terminus of miRNAs is often 
truncated by one or two nucleotides. miRNAs that are more than 85% identical 
in sequence (i.e. share 18 out of 21 nucleotides) or contain 1- or 2-nucleotide 
internal deletions are referred to by the same gene number followed by a 
lowercase letter. Minor sequence variations between related miRNAs are 
generally found near the ends of the miRNA sequence and are thought to not 
compromise target RNA recognition. Minor sequence variations may also 
represent A to G and C to U changes, which are accommodated as G-U wobble 
base pairs during target recognition. Mouse brains were dissected into 
midbrain, mb, cortex, cx, cerebellum, cb. The tissues analyzed were lung, ln; 
liver, lv; spleen, sp; kidney, kd; skin, sk; testis, ts; ovary, ov; thymus, thy; eye, 
ey; cortex, ct; cerebellum, cb; midbrain, mb. The human osteosarcoma 
SAOS-2 cells contained an inducible p53 gene (p53-, uninduced p53; p53+, 
induced p53); the differences in miRNAs identified from induced and uninduced 
SAOS-2 cells were not statistically significant. 
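
The naming rule stated in the legend (more than 85% identity, i.e. 18 out of 21 nucleotides, for a shared gene number) can be illustrated with let-7 sequences listed in Table 3. This is a sketch under simplifying assumptions: sequences are compared position by position from the 5' end without gaps, so the clause on 1- or 2-nucleotide internal deletions is not modelled, and the function names and the threshold parameter are illustrative.

```python
def identity(a: str, b: str) -> float:
    """Fraction of identical positions, counted from the 5' end, relative to the longer sequence."""
    longest = max(len(a), len(b))
    matches = sum(x == y for x, y in zip(a, b))
    return matches / longest


def same_gene_number(a: str, b: str, threshold: float = 0.85) -> bool:
    """Apply the 'more than 85% identical' rule (ungapped comparison only)."""
    return identity(a, b) > threshold


# Sequences as listed in Table 3.
LET_7A = "UGAGGUAGUAGGUUGUAUAGUU"
LET_7B = "UGAGGUAGUAGGUUGUGUGGUU"
LET_7C = "UGAGGUAGUAGGUUGUAUGGUU"
MIR_1B = "UGGAAUGUAAAGAAGUAUGUAA"

if __name__ == "__main__":
    for name, seq in [("let-7b", LET_7B), ("let-7c", LET_7C), ("miR-1b", MIR_1B)]:
        print(f"let-7a vs {name}: {identity(LET_7A, seq):.3f}", same_gene_number(LET_7A, seq))
```

let-7a and let-7c differ at a single position (an A to G change) and let-7a and let-7b at two positions, so both pairs fall well above the threshold, whereas let-7a and miR-1b do not.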



Table 5 



D. melanogaster miRNA sequences and genomic location. The sequences given 
represent the most abundant, and typically longest, miRNA sequences identified 
by cloning. It was frequently observed that miRNAs vary in length by one or 
two nucleotides at their 3'-terminus. From 222 short RNAs sequenced, 69 
(31%) corresponded to miRNAs, 103 (46%) to already characterized functional 
RNAs (rRNA, 7SL RNA, tRNAs), 30 (14%) to transposon RNA fragments, and 
20 (10%) sequences with no database entry. RNA sequences with a 
5'-guanosine are likely to be underrepresented due to the cloning procedure (8). 
miRNA homologs found in other species are indicated. Chromosomal location 
(chr.) and GenBank accession numbers (acc. nb.) are indicated. No ESTs 
matching miR-1 to miR-14 were detectable by database searching. 






miRNA sequence (5' to 3') 



chr., acc. nb. remarks 



miR-1 



UGGAAUGUAAAGAAGUAUGGAG 2L, AE003667 
(SEQ ID NO: 58) 



homologs: C. briggsae, G20U, 
AC87074; C.elegans G20U, 
U97405; mouse, G20U, G22U, 
AC020867; human, chr. 20, 
G20U, G22U, AL449263; ESTs: 
zebrafish, G20U, G22U, BF157601; 
cow, G20U, G22U, BE722224; 
human, G20U, G22U, 
AI220268 



miR-2a 



UAUCACAGCCAGCUUUGAUGAGC 2L, AE003663 2 precursor variants clustered 
(SEQ ID NO: 59) with a copy of mir-2b 



miR-2b 



UAUCACAGCCAGCUUUGAGGAGC 2L, AE003620 2 precursor variants 
(SEQ ID NO: 60) 2L, AE003663 



miR-3 UCACUGGGCAAAGUGUGUCUCA 2R, AE003795 in cluster mir-3 to mir-6 

(SEQ ID NO: 61) 

miR-4 AUAAAGCUAGACAACCAUUGA 2R, AE003795 in cluster mir-3 to mir-6 

(SEQ ID NO: 62) 

miR-5 AAAGGAACGAUCGUUGUGAUAUG 2R, AE003795 

(SEQ ID NO: 63) 

miR-6 UAUCACAGUGGCUGUUCUUUUU 2R, AE003795 

(SEQ ID NO: 64) 



miR-7 UGGAAGACUAGUGAUUUUGUUGU 2R, AE003791 

(SEQ ID NO: 65) 



miR-8 UAAUACUGUCAGGUAAAGAUGUC 2R, AE003805 
(SEQ ID NO: 66) 

miR-9 UCUUUGGUUAUCUAGCUGUAUGA 3L, AE003516 
(SEQ ID NO: 67) 

miR-10 ACCCUGUAGAUCCGAAUUUGU AE001574 
(SEQ ID NO: 68) 



miR-11 CAUCACAGUCUGAGUUCUUGC 3R, AE003735 
(SEQ ID NO: 69) 

miR-12 UGAGUAUUACAUCAGGUACUGGU X, AE003499 
(SEQ ID NO: 70) 

miR-13a UAUCACAGCCAUUUUGACGAGU 3R, AE003708 
(SEQ ID NO: 71) X, AE003446 



miR-13b UAUCACAGCCAUUUUGAUGAGU 3R, AE003708 

(SEQ ID NO: 72) 



in cluster mir-3 to mir-6 

in cluster mir-3 to mir-6 with 3 
variants 

homologs: human, chr. 19 
AC006537, EST BF373391; 
mouse chr. 17 AC026385, EST 
AA881786 



homologs: mouse, chr. 19, 
AF155142; human, chr. 5, 
AC026701, chr. 15, AC005316 

homologs: mouse, chr 11, 
AC011194; human, chr. 17, 
AF287967 

intronic location 

intronic location 

mir-13a clustered with mir-13b 
on chr. 3R 

mir-13a clustered with mir-13b 
on chr. 3R 



miR-14 



UCAGUCUUUUUCUCUCUCCUA 2R, AE003833 

(SEQ ID NO:73) 



no signal by Northern analysis 
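
The substitution shorthand used in the remarks column for homologs (e.g. G20U, G22U for miR-1) can be applied programmatically. The sketch below assumes the usual reading of this notation, which is not spelled out in the table: the original base, its 1-based position counted from the 5' end, and the replacing base. The function name is illustrative.

```python
import re


def apply_substitutions(seq: str, substitutions: list[str]) -> str:
    """Apply point substitutions such as 'G20U' (assumed: old base, 1-based position, new base)."""
    bases = list(seq.upper())
    for sub in substitutions:
        match = re.fullmatch(r"([ACGU])(\d+)([ACGU])", sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if pos < 1 or pos > len(bases):
            raise ValueError(f"position {pos} lies outside the sequence")
        if bases[pos - 1] != old:
            raise ValueError(f"expected {old} at position {pos}, found {bases[pos - 1]}")
        bases[pos - 1] = new
    return "".join(bases)


DMEL_MIR_1 = "UGGAAUGUAAAGAAGUAUGGAG"    # D. melanogaster miR-1 (SEQ ID NO: 58)
MOUSE_MIR_1D = "UGGAAUGUAAAGAAGUAUGUAUU"  # mouse miR-1d as listed in Table 3

if __name__ == "__main__":
    derived = apply_substitutions(DMEL_MIR_1, ["G20U", "G22U"])
    print(derived)
    print("prefix of mouse miR-1d:", MOUSE_MIR_1D.startswith(derived))
```

Read this way, the G20U, G22U variant of the fly sequence coincides with the first 22 nucleotides of mouse miR-1d as listed in Table 3, which supports the assumed interpretation of the shorthand.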
Table 6 

Human miRNA sequences and genomic location. From 220 short RNAs 
sequenced, 100 (45%) corresponded to miRNAs, 53 (24%) to already 
characterized functional RNAs (rRNA, snRNAs, tRNAs), and 67 (30%) 
sequences with no database entry. For legend, see Table 1. 

miRNA sequence (5' to 3') chr. or EST, remarks* 

acc. nb. 



let-7a 



UGAGGUAGUAGGUUGUAUAGUU 
(SEQ ID NO: 75) 



9, AC007924, 
11, AP001359, 
17, AC087784, 
22, AL049853 



sequences of chr 9 and 17 
identical and clustered with let-7f; 
homologs: C. elegans, AF274345; 
C. briggsae, AF210771; D. 
melanogaster, AE003659 



let-7b 



UGAGGUAGUAGGUUGUGUGGUU 22, AL049853†, homologs: mouse, EST AI481799; 



(SEQ ID NO: 76) 



ESTs, AI382133, rat, EST, BE120662 
AW028822 



let-7c UGAGGUAGUAGGUUGUAUGGUU 21, AP001667 homologs: mouse, EST, 

(SEQ ID NO: 77) AA575575 



let-7d AGAGGUAGUAGGUUGCAUAGU 17, AC087784, identical precursor sequences 

(SEQ ID NO: 78) 9, AC007924 



let-7e 



UGAGGUAGGAGGUUGUAUAGU 19, AC018755 
(SEQ ID NO: 79) 



let-7f UGAGGUAGUAGAUUGUAUAGUU 9, AC007924, 



(SEQ ID NO: 80) 



17, AC087784, 
X, AL592046 



sequences of chr 9 and 17 
identical and clustered with let-7a 



miR-15 UAGCAGCACAUAAUGGUUUGUG 13, AC069475 in cluster with mir-16 homolog 

(SEQ ID NO: 81) 

miR-16 UAGCAGCACGUAAAUAUUGGCG 13, AC069475 in cluster with mir-15 homolog 

(SEQ ID NO: 82) 

miR-17 ACUGCAGUGAAGGCACUUGU 13, AL138714 

(SEQ ID NO:83) 

miR-18 UAAGGUGCAUCUAGUGCAGAUA 13, AL138714 

(SEQ ID NO:84) 

miR-19a UGUGCAAAUCUAUGCAAAACUGA 13, AL138714 

(SEQ ID NO: 85) 



in cluster with mir-17 to mir-20 



in cluster with mir-17 to mir-20 



in cluster with mir-17 to mir-20 



miR-19b UGUGCAAAUCCAUGCAAAACUGA 13, AL138714, in cluster with mir-17 to mir-20 
(SEQ ID NO: 86) X, AC002407 



miR-20 



miR-21 



UAAAGUGCUUAUAGUGCAGGUA 13, AL138714 
(SEQ ID NO: 87) 

UAGCUUAUCAGACUGAUGUUGA 17, AC004686, 

(SEQ ID NO: 88) EST, BF326048 



in cluster with mir-17 to mir-20 

homologs: mouse, EST, 
AA209594 



miR-22 AAGCUGCCAGUUGAAGAACUGU 

(SEQ ID NO: 89) 



ESTs, 

AW961681†, 
AA456477, 
AI752503, 
BF030303, 
HS1242049 



human ESTs highly similar; 
homologs: mouse, ESTs, e.g. 
AA823029; rat, ESTs, e.g. 
BF543690 



miR-23 AUCACAUUGCCAGGGAUUUCC 19, AC020916 homologs: mouse, EST, 

(SEQ ID NO: 90) AW124037; rat, EST, BF402515 



miR-24 UGGCUCAGUUCAGCAGGAACAG 9, AF043896, 

(SEQ ID NO:91) 19, AC020916 



homologs: mouse, ESTs, 
AA111466, AI286629; pig, EST, 
BE030976 



miR-25 CAUUGCACUUGUCUCGGUCUGA 7, AC073842, 

(SEQ ID NO: 92) EST, BE077684 



human chr 7 and EST identical; 
highly similar precursors in 
mouse ESTs (e.g. AI595464); fish 
precursor different STS: G46757 



miR-26a 



UUCAAGUAAUCCAGGAUAGGCU 3, AP000497 
(SEQ ID NO: 93) 

miR-26b UUCAAGUAAUUCAGGAUAGGUU 2, AC021016 
(SEQ ID NO: 94) 

miR-27 UUCACAGUGGCUAAGUUCCGCU 19, AC20916 
(SEQ ID NO: 95) 



U22C mutation in human genomic 
sequence 



miR-28  AAGGAGCUCACAGUCUAUUGAG (SEQ ID NO: 96)  3, AC063932

miR-29  CUAGCACCAUCUGAAAUCGGUU (SEQ ID NO: 97)  7, AF017104

miR-30  CUUUCAGUCGGAUGUUUGCAGC (SEQ ID NO: 98)  6, AL035467

miR-31  GGCAAGAUGCUGGCAUAGCUG (SEQ ID NO: 99)  9, AL353732

miR-32  UAUUGCACAUUACUAAGUUGC (SEQ ID NO: 100)  9, AL354797  not detected by Northern blotting

miR-33  GUGCAUUGUAGUUGCAUUG (SEQ ID NO: 101)  22, Z99716  not detected by Northern blotting



*If several ESTs were retrieved for one organism in the database, only those
with different precursor sequences are listed.
† precursor structure shown in Fig. 4.
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Claims 

1. Isolated nucleic acid molecule comprising

(a) a nucleotide sequence as shown in Table 1, Table 2, Table 3 or
Table 4 or a precursor thereof as shown in Figure 3, Figure 4 or
Figure 7,

(b) a nucleotide sequence which is the complement of (a),

(c) a nucleotide sequence which has an identity of at least 80% to a
sequence of (a) or (b) and/or

(d) a nucleotide sequence which hybridizes under stringent conditions
to a sequence of (a), (b) and/or (c).

2. The nucleic acid molecule of claim 1, wherein the identity of sequence
(c) is at least 90%.

3. The nucleic acid molecule of claim 1, wherein the identity of sequence
(c) is at least 95%.
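The identity thresholds of 80%, 90% and 95% recited above can be illustrated with a short calculation. The following is only a sketch under stated assumptions: it compares two sequences position by position without gaps and divides the number of matches by the length of the reference sequence. The function name `percent_identity` is hypothetical, and this scoring rule is an illustrative choice, not a definition taken from the claims.

```python
def percent_identity(candidate: str, reference: str) -> float:
    """Ungapped, position-by-position identity in percent.

    Assumption for illustration: matches are counted over aligned
    positions and normalised by the length of the reference sequence.
    """
    cand = candidate.upper().replace("T", "U")
    ref = reference.upper().replace("T", "U")
    matches = sum(1 for a, b in zip(cand, ref) if a == b)
    return 100.0 * matches / len(ref)


# Sequences as given in the sequence listing.
LET_7A = "UGAGGUAGUAGGUUGUAUAGUU"  # SEQ ID NO: 75
LET_7F = "UGAGGUAGUAGAUUGUAUAGUU"  # SEQ ID NO: 80
MIR_15 = "UAGCAGCACAUAAUGGUUUGUG"  # SEQ ID NO: 81
MIR_16 = "UAGCAGCACGUAAAUAUUGGCG"  # SEQ ID NO: 82

for name, cand, ref in [("let-7f vs let-7a", LET_7F, LET_7A),
                        ("miR-15 vs miR-16", MIR_15, MIR_16)]:
    print(f"{name}: {percent_identity(cand, ref):.1f}% identity")
```

Under this simple rule, let-7f differs from let-7a at a single position, whereas miR-15 and miR-16 differ at six positions and fall below the 80% threshold.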

4. The nucleic acid molecule of any one of claims 1-3, which is selected
from miR 1-14 as shown in Table 1 or miR 15-33 as shown in Table 2 or
miR 1-155 as shown in Table 3 or miR-C1-34 as shown in Table 4 or a
complement thereof.

5. The nucleic acid molecule of any one of claims 1-3, which is selected
from mir 1-14 as shown in Figure 3 or let 7a-7f or mir 15-33, as shown
in Figure 4 or let 7a-i or mir 1-155 or mir-c1-34, as shown in Figure 7 or
a complement thereof.
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6. The nucleic acid molecule of any one of claims 1-4 which is a miRNA 
molecule or an analog thereof having a length of from 18-25 nucleotides. 



7. The nucleic acid molecule of any one of claims 1-3 or 5, which is a
miRNA precursor molecule having a length of 60-80 nucleotides or a
DNA molecule coding therefor.

8. The nucleic acid molecule of any one of claims 1-7, which is single-stranded.

9. The nucleic acid molecule of any one of claims 1-7, which is at least
partially double-stranded.

10. The nucleic acid molecule of any one of claims 1-9, which is selected
from RNA, DNA or nucleic acid analog molecules.

11. The nucleic acid molecule of claim 10, which is a molecule containing at
least one modified nucleotide analog.

12. The nucleic molecule of claim 10 which is a recombinant expression
vector.

13. A pharmaceutical composition containing as an active agent at least one
nucleic acid molecule of any one of claims 1-12 and optionally a
pharmaceutically acceptable carrier.

14. The composition of claim 13 for diagnostic applications. 

15. The composition of claim 13 for therapeutic applications. 

16. The composition of any one of claims 13-15 as a marker or a modulator
for developmental or pathogenic processes.

17. The composition of claim 13 as a marker or modulator of developmental
disorders, particularly cancer, such as B-cell chronic leukemia.

18. The composition of any one of claims 13-15 as a marker or modulator of
gene expression.

19. The composition of claim 18 as a marker or modulator of the expression
of a gene, which is at least partially complementary to said nucleic acid
molecule.

20. A method of identifying microRNA molecules or precursor molecules
thereof comprising ligating 5'- and 3'-adapter molecules to the ends of a
size-fractionated RNA population, reverse transcribing said adapter-containing
RNA population and characterizing the reverse transcription products.
C. elegans lin-4                  UCCCUGAGACCUC--AAG-UGUGA
D. melanogaster miR-125           UCCCUGAGACCCU--AACUUGUGA
M. musculus/H. sapiens miR-125b   UCCCUGAGACCCU--AACUUGUGA
M. musculus/H. sapiens miR-125a   UCCCUGAGACCCUUUAACCUGUGA
<110> Max-Planck-Gesellschaft zur Förderung der Wissensc

<120> Identification of novel genes encoding for small 
temporal RNA 

<130> 26250PWO_DR 

<140> 
<141> 

<150> EP 01 123 453.1 
<151> 2001-09-28 

<150> EP 02 006 712.0 
<151> 2002-03-22 

<150> EP 02 016 772.2 
<151> 2002-07-26 

<160> 217 

<170> PatentIn Ver. 2.1

<210> 1 
<211> 29 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 1 

tactatacaa cctactacct caatttgcc 29 
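Each entry of this listing follows the same numbered-field layout: <210> gives the sequence number, <211> the length, <212> the molecule type, <213> the organism and <400> the residues. Because the length is stated twice (in <211> and implicitly in the residues), entries can be cross-checked mechanically. The following is a minimal sketch of such a check; the function names `parse_records` and `length_mismatches` are hypothetical, and the parser assumes clean input in which every entry carries a <211> field.

```python
import re

# Entry 1 of the listing, copied verbatim as sample input.
SAMPLE = """\
<210> 1
<211> 29
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:
Oligonucleotide
<400> 1
tactatacaa cctactacct caatttgcc 29
"""


def parse_records(text):
    """Collect one dict per <210> entry; residues after <400> go into 'seq'."""
    records, current, field = [], None, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        tag = re.match(r"<(\d{3})>\s*(.*)", line)
        if tag:
            field, value = tag.group(1), tag.group(2).strip()
            if field == "210":  # a new entry starts here
                current = {"210": value, "seq": ""}
                records.append(current)
            elif current is not None:
                current[field] = value
        elif field == "400" and current is not None:
            # Residue line: keep the letters, drop spaces and the running count.
            current["seq"] += "".join(re.findall(r"[a-z]+", line))
    return records


def length_mismatches(records):
    """Return the <210> identifiers whose residue count differs from <211>."""
    return [r["210"] for r in records if int(r["211"]) != len(r["seq"])]


records = parse_records(SAMPLE)
print(records[0]["seq"], length_mismatches(records))
```

An empty list from `length_mismatches` means every parsed entry has as many residues as its <211> field declares.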



<210> 2 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 2

actatgcaac ctactacctc t 21



<210> 3 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 3 

actatacaac ctcctacctc a 21 

<210> 4
<211> 19
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence:
Oligonucleotide

<400> 4

tggtgtttcc gcccgggaa 19

<210> 5
<211> 22
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence:
Oligonucleotide

<400> 5

tggaatgtaa agaagtatgg ag 22

<210> 6
<211> 23
<212> DNA

<213> Artificial Sequence
<220>



wo 03/029459 



<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 6 

gctcctcaaa gctggctgtg ata 

<210> 7 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 7 

tgagacacac tttgcccagt ga 

<210> 8 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 

<400> 8 

tcaatggttg tctagcttta t 

<210> 9 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 

<400> 9 

catatcacaa cgatcgttcc ttt 



<210> 10 
<211> 22 






<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 

<400> 10 

aaaaagaaca gccactgtga ta 



<210> 11 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 

<400> 11

tggaagacta gtgattttgt tgt 

<210> 12 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 

<400> 12 

gacatcttta cctgacagta tta 

<210> 13 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 

<400> 13 

tcatacagct agataaccaa aga 






<210> 14 
<211> 21
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence:
Oligonucleotide

<400> 14 

acaaattcgg atctacaggg t 

<210> 15 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 

<400> 15 

gcaagaactc agactgtgat g 

<210> 16 

<211> 23 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 

<400> 16 

accagtacct gatgtaatac tca

<210> 17 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
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Oligonucleotide 
<400> 17 

actcgtcaaa atggctgtga ta 

<210> 18 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 

<400> 18 

taggagagag aaaaagactg a 



<210> 19 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 

<400> 19 

tagcagcaca taatggtttg t 

<210> 20 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 

<400> 20 

gccaatattt acgtgctgct a 

<210> 21 
<211> 22 
<212> DNA 
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<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 21 

tacaagtgcc ttcactgcag ta 

<210> 22 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 22 

tatctgcact agatgcacct ta 

<210> 23 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 

<400> 23 

tcagttttgc atagatttgc aca 

<210> 24 

<211> 22 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 

<400> 24 

tacctgcact ataagcactt ta 






<210> 25 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide

<400> 25 

tcaacatcag tctgataagc ta 

<210> 26 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 

<400> 26 

acagttcttc aactggcagc tt 

<210> 27 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 27 

ggaaatccct ggcaatgtga t 

<210> 28 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 
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<400> 28 

ctgttcctgc tgaactgagc ca 



<210> 29 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 

<400> 29 

tcagaccgag acaagtgcaa tg 



<210> 30 
<211> 22 
<212> DNA

<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 

<400> 30 

agcctatcct ggattacttg aa 



<210> 31 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 

<400> 31 

agcggaactt agccactgtg aa 



<210> 32 
<211> 22 
<212> DNA 

<213> Artificial Sequence 






<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 32 

ctcaatagac tgtgagctcc tt 

<210> 33 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 33 

aaccgatttc agatggtgct ag 

<210> 34 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 

<400> 34 

gctgcaaaca tccgactgaa ag 

<210> 35 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence
Oligonucleotide 

<400> 35 

cagctatgcc agcatcttgc ct 
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<210> 36 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 36 

gcaacttagt aatgtgcaat a 

<210> 37 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 37 

tgcaatgcaa ctacaatgca cc 



<210> 38 
<211> 22 
<212> DNA 

<213> Artificial Sequence 


<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 

<400> 38 

ctccatactt ctttacattc ca 

<210> 39 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 
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<400> 39 

gctgagtgta ggatgtttac a 

<210> 40 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 

<400> 40 

gcttccagtc gaggatgttt aca 

<210> 41 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 

<400> 41 

cgcaaggtcg gttctacggg tg 

<210> 42 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 

<400> 42 

tcagttatca cagtactgta 

<210> 43 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
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<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 

<400> 43 

acaaacacca ttgtcacact cca 

<210> 44 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 

<400> 44 

tggcattcac cgcgtgcctt a 

<210> 45 
<211> 23 
<212> DNA 

<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 

<400> 45 

cacaggttaa agggtctcag gga 

<210> 46
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 

<400> 46 

tcacaagtta gggtctcagg ga 
<210> 47 
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<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 47 

agccaagctc agacggatcc ga 

<210> 48 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 48 

aaaagagacc ggttcactct ga 


<210> 49 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 49 

gcaagcccag accgaaaaaa g 

<210> 50 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 50

gcccttttaa cattgcactc



<210> 51 
<211> 21 
<212> DNA 

<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 51 

actttcggtt atctagcttt a 

<210> 52 

<211> 23 

<212> DNA 

<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 52 

acgaccatgg ctgtagactg tta 

<210> 53 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Combined DNA/RNA Molecule 
Oligonucleotide 

<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 53 

tgagctacag tgcttcatct ca 

<210> 54 
<211> 18 
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<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 

<400> 54 

uuuaaccgcg aattccag 

<210> 55 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 

<400> 55 

acggaattcc tcactaaa 

<210> 56 

<211> 23 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 

Oligonucleotide

<400> 56 

cacaggttaa agggtctcag gga 

<210> 57 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence 
Oligonucleotide 

<400> 57 

cagccaacgg aattcctcac taaa 
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<210> 58 
<211> 22 
<212> RNA 

<213> D. melanogaster 
<400> 58 

uggaauguaa agaaguaugg ag 22 
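Several of the DNA oligonucleotides at the start of this listing read as reverse complements of RNA entries that appear later; for instance, SEQ ID NO: 38 is the DNA reverse complement of SEQ ID NO: 58 above. The relationship can be checked with the short sketch below. The helper name `dna_reverse_complement` is hypothetical, and the pairing of these two entries is an observation drawn from the listed residues rather than a statement made in the listing itself.

```python
# Map each RNA base to its complementary DNA base.
RNA_TO_DNA_COMPLEMENT = str.maketrans("acgu", "tgca")


def dna_reverse_complement(rna: str) -> str:
    """Return the DNA reverse complement of an RNA sequence (both 5'->3')."""
    residues = rna.replace(" ", "").lower()
    return residues[::-1].translate(RNA_TO_DNA_COMPLEMENT)


SEQ_58 = "uggaauguaa agaaguaugg ag"  # RNA, D. melanogaster, SEQ ID NO: 58
SEQ_38 = "ctccatactt ctttacattc ca"  # DNA oligonucleotide, SEQ ID NO: 38

derived = dna_reverse_complement(SEQ_58)
print(derived, derived == SEQ_38.replace(" ", ""))
```

The same helper can be applied to other RNA entries to look for a matching oligonucleotide among SEQ ID NO: 1 to 57.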



<210> 59 
<211> 23 
<212> RNA 

<213> D. melanogaster 
<400> 59 

uaucacagcc agcuuugaug agc

<210> 60 
<211> 23 
<212> RNA 

<213> D. melanogaster 
<400> 60 

uaucacagcc agcuuugagg agc



<210> 61 
<211> 22 
<212> RNA 

<213> D. melanogaster 
<400> 61 

ucacugggca aagugugucu ca 22 



<210> 62 
<211> 21 
<212> RNA 

<213> D. melanogaster 
<400> 62 

auaaagcuag acaaccauug a 21



<210> 63
<211> 23
<212> RNA

<213> D. melanogaster
<400> 63

aaaggaacga ucguugugau aug 23



<210> 64 
<211> 22 
<212> RNA 

<213> D. melanogaster 
<400> 64 

uaucacagug gcuguucuuu uu 22 

<210> 65
<211> 23
<212> RNA

<213> D. melanogaster
<400> 65

uggaagacua gugauuuugu ugu 23

<210> 66
<211> 23
<212> RNA

<213> D. melanogaster
<400> 66

uaauacuguc agguaaagau guc 23

<210> 67
<211> 23
<212> RNA

<213> D. melanogaster
<400> 67

ucuuugguua ucuagcugua uga 23



<210> 68 
<211> 21 
<212> RNA 



<213> D. melanogaster 
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<400> 68 

acccuguaga uccgaauuug u 21 



<210> 69 
<211> 21 
<212> RNA 

<213> D. melanogaster 
<400> 69 

caucacaguc ugaguucuug c 21 



<210> 70 
<211> 23 
<212> RNA 

<213> D. melanogaster 
<400> 70 

ugaguauuac aucagguacu ggu 23 



<210> 71 
<211> 22 
<212> RNA 

<213> D. melanogaster 
<400> 71 

uaucacagcc auuuugacga gu 22 



<210> 72 
<211> 22 
<212> RNA 

<213> D. melanogaster 
<400> 72 

uaucacagcc auuuugauga gu 22 



<210> 73 
<211> 21 
<212> RNA 

<213> D. melanogaster 
<400> 73 

ucagucuuuu ucucucuccu a 21 
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<210> 74 
<211> 22 
<212> RNA 

<213> D. melanogaster 
<400> 74 

ugagguagua gguuguauag uu 22 



<210> 75 
<211> 22 
<212> RNA 
<213> Human

<400> 75 

ugagguagua gguuguauag uu 22 



<210> 76 
<211> 22 
<212> RNA 
<213> Human

<400> 76 

ugagguagua gguugugugg uu 22 



<210> 77 
<211> 22 
<212> RNA 
<213> Human 

<400> 77 

ugagguagua gguuguaugg uu 22 



<210> 78 
<211> 21 
<212> RNA 
<213> Human 

<400> 78 

agagguagua gguugcauag u 21 



<210> 79
<211> 21
<212> RNA
<213> Human

<400> 79

ugagguagga gguuguauag u 21



<210> 80
<211> 22 
<212> RNA 
<213> Human 

<400> 80 

ugagguagua gauuguauag uu 22 

<210> 81
<211> 22
<212> RNA
<213> Human

<400> 81

uagcagcaca uaaugguuug ug 22

<210> 82
<211> 22
<212> RNA
<213> Human

<400> 82

uagcagcacg uaaauauugg cg 22

<210> 83
<211> 20
<212> RNA
<213> Human

<400> 83

acugcaguga aggcacuugu 20



<210> 84
<211> 22
<212> RNA
<213> Human

<400> 84

uaaggugcau cuagugcaga ua 22



<210> 85 
<211> 23 
<212> RNA 
<213> Human 

<400> 85 

ugugcaaauc uaugcaaaac uga 23 



<210> 86 
<211> 23 
<212> RNA 
<213> Human

<400> 86 

ugugcaaauc caugcaaaac uga 23 

<210> 87 
<211> 22 
<212> RNA 
<213> Human 

<400> 87 

uaaagugcuu auagugcagg ua 22 



<210> 88 
<211> 22 
<212> RNA 
<213> Human 

<400> 88 

uagcuuauca gacugauguu ga 22 



<210> 89 
<211> 22 
<212> RNA 
<213> Human 



<400> 89 

aagcugccag uugaagaacu gu 22 

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<210> 90 
<211> 21 
<212> RNA 
<213> Human 

<400> 90 

aucacauugc cagggauuuc c 21 



<210> 91 
<211> 22 
<212> RNA 
<213> Human 

<400> 91 

uggcucaguu cagcaggaac ag 22 

<210> 92 
<211> 22 
<212> RNA 
<213> Human 

<400> 92 

cauugcacuu gucucggucu ga 22



<210> 93 
<211> 22 
<212> RNA 
<213> Human 

<400> 93 

uucaaguaau ccaggauagg cu 22 



<210> 94 
<211> 22 
<212> RNA 
<213> Human 

<400> 94 

uucaaguaau ucaggauagg uu 22 



<210> 95
<211> 22
<212> RNA
<213> Human

<400> 95

uucacagugg cuaaguuccg cu 22



<210> 96 
<211> 22 
<212> RNA 
<213> Human 

<400> 96 

aaggagcuca cagucuauug ag 22 

<210> 97
<211> 22
<212> RNA
<213> Human

<400> 97

cuagcaccau cugaaaucgg uu 22

<210> 98
<211> 22
<212> RNA
<213> Human

<400> 98

cuuucagucg gauguuugca gc 22

<210> 99
<211> 21
<212> RNA
<213> Human

<400> 99

ggcaagaugc uggcauagcu g 21



<210> 100
<211> 21
<212> RNA
<213> Human

<400> 100

uauugcacau uacuaaguug c 21



<210> 101 
<211> 19 
<212> RNA 
<213> Human

<400> 101 

gugcauugua guugcauug 19 



<210> 102 
<211> 22 
<212> RNA 
<213> Human 

<400> 102 

uggaauguaa agaaguaugg ag 22 



<210> 103 
<211> 23 
<212> RNA 
<213> Human

<400> 103 

uggaagacua gugauuuugu ugu 23 



<210> 104 
<211> 23 
<212> RNA 
<213> Human 

<400> 104 

ucuuugguua ucuagcugua uga 23 



<210> 105 

<211> 21 

<212> RNA 

<213> Human 

<400> 105 

acccuguaga uccgaauuug u 21 
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<210> 106 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 106 

ugagguagua gguuguauag uu 22 



<210> 107 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 107 

ugagguagua gguugugugg uu 22



<210> 108 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 108 

ugagguagua gguuguaugg uu 22 

<210> 109 
<211> 21 
<212> RNA 
<213> Mouse 

<400> 109 

agagguagua gguugcauag u 21 



<210> 110 
<211> 21 
<212> RNA 
<213> Mouse 

<400> 110 

ugagguagga gguuguauag u 21 



<210> 111 
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<211> 22 
<212> RNA
<213> Mouse 

<400> 111 

ugagguagua gauuguauag uu 22 



<210> 112 

<211> 22 

<212> RNA 

<213> Mouse 

<400> 112 

ugagguagua guuuguacag ua 22 



<210> 113 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 113 

ugagguagua guguguacag uu 22 



<210> 114 
<211> 19 
<212> RNA 
<213> Mouse

<400> 114 

ugagguagua guuugugcu 19 



<210> 115 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 115 

uggaauguaa agaaguaugu aa 22 



<210> 116 

<211> 22 

<212> RNA 

<213> Mouse 
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<400> 116 

uggaauguaa agaaguaugu ac 22 



<210> 117 
<211> 23 
<212> RNA 
<213> Mouse 

<400> 117 

uggaauguaa agaaguaugu auu 23



<210> 118 
<211> 23 
<212> RNA 
<213> Mouse 

<400> 118 

ucuuugguua ucuagcugua uga 23 



<210> 119 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 119 

uagcagcaca uaaugguuug ug 22 



<210> 120 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 120 

uagcagcaca ucaugguuua ca 22 



<210> 121 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 121 

uagcagcacg uaaauauugg cg 22
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<210> 122 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 122 

uaaggugcau cuagugcaga ua 22 



<210> 123 
<211> 23 
<212> RNA 
<213> Mouse 

<400> 123 

ugugcaaauc caugcaaaac uga 23 



<210> 124 
<211> 23 
<212> RNA 
<213> Mouse 

<400> 124 

uaaagugcuu auagugcagg uag 23 



<210> 125 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 125 

uagcuuauca gacugauguu ga 22 



<210> 126 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 126 

aagcugccag uugaagaacu gu 22 



<210> 127 
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<211> 21 
<212> RNA 
<213> Mouse 

<400> 127 

aucacauugc cagggauuuc c 21 



<210> 128 
<211> 23 
<212> RNA 
<213> Mouse 

<400> 128 

aucacauugc cagggauuac cac 23 



<210> 129 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 129 

uggcucaguu cagcaggaac ag 22 



<210> 130 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 130 

uucaaguaau ccaggauagg cu 22



<210> 131 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 131 

uucaaguaau ucaggauagg uu 22 



<210> 132 
<211> 22 
<212> RNA 
<213> Mouse 
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<400> 132 

uucacagugg cuaaguuccg cu 22



<210> 133 
<211> 20 
<212> RNA 
<213> Mouse 

<400> 133 

uucacagugg cuaaguucug 20 



<210> 134 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 134 

cuagcaccau cugaaaucgg uu 22 



<210> 135 
<211> 23 
<212> RNA 
<213> Mouse 

<400> 135 

uagcaccauu ugaaaucagu guu 23 



<210> 136 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 136 

uagcaccauu ugaaaucggu ua 22 



<210> 137 
<211> 23 
<212> RNA 
<213> Mouse 

<400> 137 

uguaaacauc cucgacugga agc 23
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<210> 138 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 138 

cuuucagucg gauguuugca gc 22 



<210> 139 
<211> 21 
<212> RNA 
<213> Mouse 



<400> 139 

uguaaacauc cuacacucag c 21 



<210> 140 
<211> 23 
<212> RNA 
<213> Mouse

<400> 140 

uguaaacauc cuacacucuc agc 23



<210> 141 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 141 

uguaaacauc cccgacugga ag 22 



<210> 142 
<211> 20 
<212> RNA 
<213> Mouse 

<400> 142 

acccguagau ccgaucuugu 20 



<210> 143 
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<211> 22 
<212> RNA 
<213> Mouse 

<400> 143 

cacccguaga accgaccuug cg 22

<210> 144 
<211> 20 
<212> RNA 
<213> Mouse 

<400> 144 

uacaguacug ugauaacuga 20 



<210> 145 
<211> 23 
<212> RNA 
<213> Mouse 

<400> 145 

uggaguguga caaugguguu ugu 23 



<210> 146 
<211> 23 
<212> RNA 
<213> Mouse 

<400> 146 

uggaguguga caaugguguu uga 23 



<210> 147 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 147 

uggaguguga caaugguguu ug 22 



<210> 148 
<211> 21 
<212> RNA 
<213> Mouse 
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<400> 148 

cauuauuacu uuugguacgc g 21



<210> 149 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 149 

uuaaggcacg cggugaaugc ca 22 



<210> 150 
<211> 21 
<212> RNA 
<213> Mouse 

<400> 150 

uuaaggcacg cgggugaaug c 21 



<210> 151 


<211> 23 
<212> RNA 
<213> Mouse 

<400> 151 

ucccugagac ccuuuaaccu gug 23 



<210> 152 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 152 

ucccugagac ccuaacuugu ga 22 



<210> 153 
<211> 21 
<212> RNA 
<213> Mouse 

<400> 153 

ucguaccgug aguaauaaug c 21 
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<210> 154 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 154 

ucggauccgu cugagcuugg cu 22 



<210> 155 

<211> 22 

<212> RNA

<213> Mouse

<400> 155 

ucacagugaa ccggucucuu uu 22 



<210> 156 
<211> 21 
<212> RNA 
<213> Mouse

<400> 156 

cuuuuuucgg ucugggcuug c 21 



<210> 157 
<211> 20 
<212> RNA 
<213> Mouse 

<400> 157 

cagugcaaug uuaaaagggc 20 



<210> 158 
<211> 21 
<212> RNA 
<213> Mouse 

<400> 158 

uaaagcuaga uaaccgaaag u 21 



<210> 159 
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<211> 23 
<212> RNA 
<213> Mouse 

<400> 159 

uaacagucua cagccauggu cgu 23 



<210> 160 
<211> 22 
<212> RNA 
<213> Mouse 



<400> 160 

uugguccccu ucaaccagcu gu 22 



<210> 161 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 161 

ugugacuggu ugaccagagg ga 22 



<210> 162 
<211> 24 
<212> RNA 
<213> Mouse 

<400> 162 

uauggcuuuu uauuccuaug ugaa 24 



<210> 163 
<211> 23 
<212> RNA 
<213> Mouse 

<400> 163 

acuccauuug uuuugaugau gga 23 



<210> 164 
<211> 22 
<212> RNA 
<213> Mouse 
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<400> 164 

uauugcuuaa gaauacgcgu ag 22 



<210> 165 
<211> 17 
<212> RNA 
<213> Mouse 

<400> 165 

agcugguguu gugaauc 17



<210> 166 
<211> 18 
<212> RNA 
<213> Mouse 

<400> 166 

ucuacagugc acgugucu 18 



<210> 167 
<211> 21 
<212> RNA 
<213> Mouse 

<400> 167 

agugguuuua cccuauggua g 21 



<210> 168 
<211> 21 
<212> RNA 
<213> Mouse 

<400> 168 

aacacugucu gguaaagaug g 21 



<210> 169 
<211> 20 
<212> RNA 
<213> Mouse 

<400> 169 

cauaaaguag aaagcacuac 20 
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<210> 170 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 170 

uguaguguuu ccuacuuuau gg 22 



<210> 171 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 171 

ugagaugaag cacuguagcu ca 22 



<210> 172 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 172 

uacaguauag augauguacu ag 22 



<210> 173 
<211> 24 
<212> RNA 
<213> Mouse 

<400> 173 

guccaguuuu cccaggaauc ccuu 24 



<210> 174 
<211> 23 
<212> RNA 
<213> Mouse 

<400> 174 

ugagaacuga auuccauggg uuu 23 



<210> 175 
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<211> 21 
<212> RNA 
<213> Mouse 

<400> 175 

guguguggaa augcuucugc c 21



<210> 176 
<211> 22 
<212> RNA 
<213> Mouse 



<400> 176 

ucagugcacu acagaacuuu gu 22 



<210> 177 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 177 

ucuggcuccg ugucuucacu cc 22 



<210> 178 
<211> 23 
<212> RNA 
<213> Mouse 

<400> 178 

ucucccaacc cuuguaccag ugu 23 



<210> 179 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 179 

cuagacugag gcuccuugag gu 22 



<210> 180 
<211> 21 
<212> RNA 
<213> Mouse 
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<400> 180 

ucagugcaug acagaacuug g 21 



<210> 181
<211> 20 
<212> RNA 
<213> Mouse 

<400> 181 

uugcauaguc acaaaaguga 20 



<210> 182 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 182 

uagguuaucc guguugccuu cg 22



<210> 183 
<211> 22 
<212> RNA 
<213> Mouse 

<400> 183 

uuaaugcuaa uugugauagg gg 22 



<210> 184 

<211> 23 

<212> RNA 

<213> Human/Mouse

<400> 184 

aacauucaac gcugucggug agu 23 



<210> 185 

<211> 22 

<212> RNA 

<213> Human/Mouse 

<400> 185 

uuuggcaaug guagaacuca ca 22 
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<210> 186 

<211> 23 

<212> RNA 

<213> Human/Mouse

<400> 186 

uauggcacug guagaauuca cug 23 



<210> 187 

<211> 22 

<212> RNA 

<213> Human/Mouse 

<400> 187 

cuuuuugcgg ucugggcuug uu 22 



<210> 188 
<211> 22 
<212> RNA 
<213> Human/Mouse 

<400> 188 

uggacggaga acugauaagg gu 22 



<210> 189 

<211> 18 

<212> RNA 

<213> Human/Mouse 

<400> 189 

uggagagaaa ggcaguuc 18



<210> 190 
<211> 23 
<212> RNA 
<213> Human/Mouse 

<400> 190 

caaagaauuc uccuuuuggg cuu 23 



<210> 191 
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<211> 22 

<212> RNA 

<213> Human/Mouse 

<400> 191 

ucgugucuug uguugcagcc gg 22 



<210> 192 
<211> 21 
<212> RNA 
<213> Human/Mouse 

<400> 192 

uaacacuguc ugguaacgau g 21



<210> 193 
<211> 22 
<212> RNA 
<213> Human/Mouse 

<400> 193

caucccuugc augguggagg gu 22 



<210> 194 
<211> 23 
<212> RNA 
<213> Human/Mouse 

<400> 194 

gugccuacug agcugacauc agu 23 



<210> 195 
<211> 22 
<212> RNA 
<213> Human/Mouse 

<400> 195 

ugauauguuu gauauauuag gu 22 



<210> 196 
<211> 22 
<212> RNA 
<213> Human/Mouse
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<400> 196 

caacggaauc ccaaaagcag cu 22 



<210> 197 
<211> 18 
<212> RNA 
<213> Human/Mouse 

<400> 197 

cugaccuaug aauugaca 18 



<210> 198 
<211> 22 
<212> RNA 
<213> Human/Mouse

<400> 198 

uaccacaggg uagaaccacg ga 22 



<210> 199 
<211> 21 
<212> RNA 
<213> Human/Mouse 

<400> 199 

aacuggccua caaaguccca g 21 



<210> 200 
<211> 22 
<212> RNA 
<213> Human/Mouse 

<400> 200 

uguaacagca acuccaugug ga 22 



<210> 201 
<211> 21 
<212> RNA 
<213> Human/Mouse 

<400> 201 

uagcagcaca gaaauauugg c 21 
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<210> 202 

<211> 21 

<212> RNA 

<213> Human/Mouse

<400> 202 

uagguaguuu cauguuguug g 21



<210> 203 

<211> 22 

<212> RNA 

<213> Human/Mouse

<400> 203 

uucaccaccu ucuccaccca gc 22 



<210> 204 

<211> 19 

<212> RNA 

<213> Human/Mouse

<400> 204 

gguccagagg ggagauagg 19 



<210> 205 

<211> 22 

<212> RNA 

<213> Human/Mouse 

<400> 205 

cccaguguuc agacuaccug uu 22 



<210> 206 

<211> 23 

<212> RNA 

<213> Human/Mouse 

<400> 206 

uaauacugcc ugguaaugau gac 23 



<210> 207 
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<211> 21 

<212> RNA 

<213> Human/Mouse 

<400> 207 

uacucaguaa ggcauuguuc u 21 



<210> 208 

<211> 22 

<212> RNA 

<213> Human/Mouse 

<400> 208 

agagguauag cgcaugggaa ga 22 



<210> 209 

<211> 21 

<212> RNA 

<213> Human/Mouse

<400> 209 

ugaaauguuu aggaccacua g 21 



<210> 210 

<211> 23 

<212> RNA 

<213> Human/Mouse 

<400> 210 

uucccuuugu cauccuaugc cug 23 



<210> 211 

<211> 22 

<212> RNA 

<213> Human/Mouse 

<400> 211 

uccuucauuc caccggaguc ug 22 



<210> 212 

<211> 23 

<212> RNA 

<213> Human/Mouse
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<400> 212 

gugaaauguu uaggaccacu aga 23

<210> 213
<211> 22
<212> RNA
<213> Human/Mouse

<400> 213

uggaauguaa ggaagugugu gg 22

<210> 214
<211> 22
<212> RNA
<213> Human/Mouse

<400> 214

uacaguaguc ugcacauugg uu 22

<210> 215
<211> 22
<212> RNA
<213> Human/Mouse

<400> 215

cccuguagaa ccgaauuugu gu 22

<210> 216
<211> 24
<212> RNA
<213> Human/Mouse

<400> 216

aacccguaga uccgaacuug ugaa 24

<210> 217
<211> 23
<212> RNA
<213> Human/Mouse

<400> 217

gcuucuccug gcucuccucc cuc 23
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